Human Systems Information Analysis Center

STATE  OF  THE  ART  REPORT 

December  2000 

The  Process  of  Physical  Fitness 
Standards  Development 

Distribution  Statement  A: 
Approved  for  public  release; 
distribution  is  unlimited. 


Editors: 

Stefan  Constable 
Barbara  Palmer 


REPORT  DOCUMENTATION  PAGE 


Form  Approved 
OMB No. 0704-0188


Public reporting burden for this collection of information is estimated to average 1 hour per response, including the time for reviewing instructions, searching existing data sources, gathering and maintaining the data needed,
and completing and reviewing this collection of information. Send comments regarding this burden estimate or any other aspect of this collection of information, including suggestions for reducing this burden to Washington
Headquarters Services, Directorate for Information Operations and Reports, 1215 Jefferson Davis Highway, Suite 1204, Arlington, VA 22202-4302, and to the Office of Management and Budget, Paperwork Reduction
Project (0704-0188), Washington, DC 20503


1. AGENCY USE ONLY (Leave blank)  2. REPORT DATE  3. REPORT TYPE AND DATES COVERED

18 December 2000  State-of-the-Art Report 18 December 2000


4.  TITLE  AND  SUBTITLE 

The  Process  of  Physical  Fitness  Standards  Development 


5.  FUNDING  NUMBERS 
SP0700-98-D-4001 


6. AUTHOR(S)

Constable, S.H. and Palmer, B. (Eds.)


7. PERFORMING ORGANIZATION NAME(S) AND ADDRESS(ES)

HSIAC Program Office
AFRL/HEC/HSIAC, Bldg. 196
2261 H Street
WPAFB, OH 45433-7022


8. PERFORMING ORGANIZATION
REPORT NUMBER
HSIAC-SOAR-2000-001


9. SPONSORING/MONITORING AGENCY NAME(S) AND ADDRESS(ES)


Performance Enhancement Division
USAF School of Aerospace Medicine
311th Human Systems Wing
Brooks Air Force Base, TX 78235-5252

11. SUPPLEMENTARY NOTES


10. SPONSORING/MONITORING
AGENCY REPORT NUMBER


Defense  Technical  Information  Center 
ATTN:  DoD  IAC  Program  Office 
(DTIC-AI) 

8725 John J. Kingman Road, Suite 0944
Fort  Belvoir,  VA  22060-6218 


12a. DISTRIBUTION / AVAILABILITY STATEMENT
Approved for public release; distribution is unlimited.


12b. DISTRIBUTION CODE

A


14. SUBJECT TERMS


physical  fitness;  standards;  cut-point;  health  based  fitness;  physical  fitness  standards; 
occupational  specialties;  physical  demands 


17.  SECURITY 
CLASSIFICATION 
OF  REPORT 

UNCLASSIFIED 


18.  SECURITY 
CLASSIFICATION 
OF  THIS  PAGE 

UNCLASSIFIED 


19.  SECURITY  CLASSIFICATION 
OF  ABSTRACT 

UNCLASSIFIED 


15. NUMBER OF PAGES
288 


16.  PRICE  CODE 


20.  LIMITATION  OF 
ABSTRACT 


Unlimited 


Standard Form 298 (Rev. 2-89)

Prescribed by ANSI Std. Z39-18
298-102


Human  Systems  IAC  SOAR,  2000 


Contributors 


Stefan  Constable,  Ph.D. 

Chief,  Performance  Enhancement  Division 

USAF School of Aerospace Medicine

2602  West  Gate  Road 

Brooks  Air  Force  Base,  TX  78235-5252 

Stefan.Constable@brooks.af.mil 

Deborah  L.  Gebhardt,  Ph.D. 

Human  Performance  Systems,  Inc. 

5000  Sunnyside  Avenue,  Suite  203 
Beltsville,  MD  20705 
hpsdgebhardt@erols.com

James A. Hodgdon, Ph.D.

Human  Performance  Center 

Naval  Health  Research  Center 

P.O.  Box  85122 

San  Diego,  CA  92186-5122 

hodgdon@nhrc.navy.mil 

Andrew  S.  Jackson, P.E.D. 

Department  of  Health  and  Human  Performance 
University  of  Houston 
Houston,  TX  77004 
Ajackson@jetson.uh.edu 

Barbara  Palmer 

AFRL/HEC/HSIAC 
2261  Monahan  Way,  Building  196 
Wright-Patterson Air Force Base, OH 45433-7022
Barbara.Palmer@wpafb.af.mil 

Mark  Rayson,  Ph.D. 

Optimal Performance Limited
Old Chambers
93/94 West Street, Farnham
Surrey GU9 7EB, United Kingdom
Mark@optimalperformance.co.uk



The Process of Physical Fitness
Standards Development


Edited  by 
Stefan  Constable 

USAF  School  of  Aerospace  Medicine 
Brooks  Air  Force  Base,  TX  78235-5252 

Barbara  Palmer 

Human  Systems  Information  Analysis  Center 
Wright-Patterson  Air  Force  Base,  OH  45433-7022 


Human  Systems  Information  Analysis  Center 
Wright-Patterson  Air  Force  Base,  Ohio,  2000 



Description  of  cover  design 


The real goal of any testing standard approach is to maximize the correct predictions of (job)
success or failure (cells A or D) and minimize the incorrect assessments (cells B and C). Various
statistical tools are available to assist in the evaluation of test accuracy. This is a notional example
of a contingency table and is analogous to the classical truth table used in testing the null
hypothesis (H0) for Type I (α) and Type II (β) error. As the illustration indicates, factors such as
sensitivity, specificity, and predictive values covary with the ratio of correct decisions to the total
number of possible decisions. Overall, this notional depiction represents a major theme of this
SOAR well: the primary goal of developing valid test standards can become surprisingly perplexing.
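
The quantities named above can be illustrated with a short sketch. The cell layout is an assumption, since the cover illustration is not reproduced here: A and D are taken as correct predictions of job success and failure, B as a false positive (passed the test, fails on the job), and C as a false negative (failed the test, would succeed on the job). The function name and the counts are hypothetical.

```python
def summarize_contingency(a, b, c, d):
    """Summarize a 2x2 test-versus-job-outcome table.

    Assumed cell layout (not taken from the report's figure):
      a: passed test, succeeds on job   (correct)
      b: passed test, fails on job      (false positive)
      c: failed test, would succeed     (false negative)
      d: failed test, fails on job      (correct)
    """
    total = a + b + c + d
    return {
        # Share of truly capable people the test passes
        "sensitivity": a / (a + c),
        # Share of truly incapable people the test screens out
        "specificity": d / (b + d),
        # Share of those passing who actually succeed
        "positive_predictive_value": a / (a + b),
        # Share of those failing who actually would not succeed
        "negative_predictive_value": d / (c + d),
        # Ratio of correct decisions to all decisions
        "proportion_correct": (a + d) / total,
    }


if __name__ == "__main__":
    # Hypothetical counts for 100 candidates
    for name, value in summarize_contingency(40, 10, 5, 45).items():
        print(f"{name}: {value:.3f}")
```

Moving the passing score shifts candidates between cells, so all five values change together; no single number describes test accuracy on its own.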

Preface


Over the years, there has been considerable interest both in job performance and standards for
physical fitness, especially within the Armed Services. Unlike most of the learned scholars who
have contributed to this compendium, my association with and interest in performance/fitness
standards are more recent. However, as my exposure to this field progressed, I became more and more
intrigued, both with the supporting science and with the practical implementation of the fitness
standards process itself. This interest and quest for more information ultimately led to the
evolution of this present State-of-the-Art Report (SOAR).

Most of us generally accept that some level of mental performance is required for entry
or advancement in the formal education process. As pointed out early in this book, various
organizations have sought to establish both mental and physical standards for job candidate
assignment and worker retention, in order to ensure job performance and safety. The processes,
however, are hardly as straightforward or black-and-white as most might envision. In fact, it has
been recommended that perhaps both good science and good judgment are required in equal measure.
Of course, the incorporation of standards for acceptable performance on tests of physical capacity
should be scientifically (and legally) defensible, which is sometimes a lofty goal! However, the
“rigors of review” may sometimes be relaxed in the military environment. For example, performance on
current military fitness tests often does not correlate well with task-specific job requirements,
especially those involving muscular strength, e.g., heavy lifting or carrying. Yet relatively few
steps have been implemented by the services to update and improve their approach to the fitness
standards process, particularly with regard to occupational and health considerations. Therefore, an
underlying theme in this text is that the scientific process to establish defensible standards can be
complex, and subject to interpretation and challenge. More specifically, it may not always be possible
to achieve the desired degree of test validity. For example, cost-benefit concerns and resource
considerations may overly constrain the expected outcomes. So in practice, the process is often
varied or abbreviated and may, at times, involve rather arbitrary decisions.

This reference document strives to provide an organized base of knowledge concerning the
primary issues of standards development as they might apply to both the military and civilian
environments. Much of the material is technically in-depth, but certain segments lend themselves to
larger target audiences as well. Chapters written by academic, industry, or military subject-matter
experts cover all of the key topics that are relevant to this field. These contributing
authors bring a unique breadth of background and experience from the academic, military,
industrial, and laboratory fields. I truly believe this review is the only one of its kind published to date.

Stefan  Constable 

About the Human Systems IAC


The  Human  Systems  Information  Analysis  Center  (HSIAC)  is  the  gateway  to  worldwide 
sources  of  up-to-date  human-factors  information  and  technologies  for  designers,  engineers, 
researchers,  and  human-factors  specialists.  HSIAC  provides  a  variety  of  products  and  services  to 
Government, industry, and academia promoting the use of ergonomics in the design of
human-operated and manned systems.

HSIAC’s primary objective is to acquire, analyze, and disseminate timely information about
ergonomics. On a cost-recovery basis, HSIAC will perform the following functions:

•  Distribute  human-factors  and  ergonomics  technologies  and  publications. 

•  Conduct  customized  bibliographic  searches  and  reviews. 

•  Prepare  state-of-the-art  reports  and  critical  reviews. 

•  Conduct  specialized  analysis  evaluations. 

•  Organize  and/or  conduct  workshops  and  conferences. 


HSIAC  is  a  Department  of  Defense  Information  Analysis  Center  sponsored  by  the  Defense 
Technical  Information  Center,  Fort  Belvoir,  Virginia.  It  is  technically  managed  by  the  Air  Force 
Research  Laboratory  Human  Effectiveness  Directorate,  Wright-Patterson  Air  Force  Base,  Ohio, 
and operated by Booz-Allen & Hamilton, McLean, Virginia.

Foreword


Requirements or standards for physical fitness within the Armed Services have received
considerable attention and focus in recent years. Reasons for this include advancement in the physiology
and medicine underlying physical fitness, expanded research on the development and practice of
fitness, expanded numbers of women in the military, and the shift in the role that physical fitness plays
in military occupations and missions. Despite this enhanced focus on military physical fitness,
relatively few steps have been implemented by the services to update and expand fitness standards,
particularly in regard to occupational and health needs. The reluctance of the training and personnel
policy communities to expand fitness requirements for military occupations may in part be due to
the lack of an organized and published base of knowledge concerning the primary issues of standards
development as they would apply to the military services. This landmark report finally responds to
this important need by presenting authoritative chapters on the key topics of physical standards
development. Each chapter, written by foremost military and industry experts in their specialties,
presents an in-depth analysis of each major aspect that challenges both military and industrial
developers of fitness requirements. Each topic, from job analysis to legal issues, provides both a
theoretical basis and a practical guide for developers and policy makers of physical fitness requirements,
both military and industrial. This report, by bringing together under one cover all critical topics
regarding the setting of standards, will make a significant contribution to implementing the proper
role of physical fitness in occupational classification and utilization and is recommended reading for
all those working in the various disciplines of physical fitness development.

James  A.  Vogel,  Ph.D. 

Retired, Former Director of Occupational Health & Performance
U.S.  Army  Research  Institute  of  Environmental  Medicine 

Executive Summary


This State-of-the-Art Report (SOAR) documents the methods, processes, and issues that are
involved with the development of physical fitness standards with special reference to the military.
There has been a long-standing interest in standards for individual physical fitness. On close
examination, this topic is truly complex. This SOAR strives to provide an organized base of knowledge
concerning the primary factors of standards development as they might apply to both the military
and civilian environments. Chapters, each written by a military or civilian subject-matter expert,
focus on history of occupational demands assessment, health-based fitness standards, job analysis,
types of physical fitness tests, test validity, setting performance standards, and legal issues. This
review is unique in both its scope and timeliness of information.

Knowing how fit personnel should be is one focus of the field of occupational demands
measurement, a field that has its roots in industrial engineering and occupational
assessment, individual differences, and physical fitness. Military fitness, or the lack of it, has been an issue
for as long as militia have existed. The physical fitness programs of the military services now employ
norm-based standards, sometimes representative of past military populations. In terms of
occupational fitness standards, only the Air Force currently employs an occupational fitness
test, the Strength Aptitude Test, which is administered at accession. Although military researchers
have vigorously studied this topic, few steps have been implemented by the services to update and
improve their approach to the fitness standards process, particularly with regard to occupational and
health considerations. Further policy and programmatic changes will be necessary as the military
services move into occupationally based fitness standards. On the other hand, industry and other
organizations have progressed relatively further here, albeit under rigorous legal constraints and challenges.

A more generic approach to standards development is to establish health-based fitness levels.
The scientific literature is replete with publications supporting the strong association between
physical activity/fitness and general health, wellness, and quality of life (detailed in Appendix A).
Increased life span, enhanced quality of life, and reduced morbidity/mortality result from an active
and fit lifestyle. From a military standpoint, “generalized” fitness standards promote physical
readiness commensurate with an active lifestyle, decreased risk of injury, and the deployability of the
military profession. Health-based standards are therefore put forward here as an adjunctive or
baseline approach to the typical (performance-based) development process. One must first attempt to
quantify the relationships between the exercise regimen (or Rx), the measured level of fitness, and
health benefit outcome. While the rationale of this approach is clear, in some cases pragmatic
barriers are evident; these include, for example, insufficient data to identify clearly defined
cut-points on which to base specific standards. Basically, this stems from the difficulty in identifying
minimal dose-response (versus adequate or optimal) relationships across the varying exercise
modalities. Nevertheless, this should not deter further developmental efforts to apply or investigate
alternative procedures for health-based (or baseline) fitness standards approaches.

Establishing the more classical job-specific fitness standards is a process that should begin with
a job analysis or Physical Demands Analysis, to describe and quantify those aspects of physical
fitness or physical performance that are relevant to job performance. A number of techniques are
available to identify the most physically demanding tasks using industrial/organizational
psychology tools, and to quantify the stress and strain associated with these tasks using physiological,
biomechanical, and psychophysical approaches. Each technique has its own strengths and
limitations, and it is the responsibility of the investigator, weighing a host of other considerations
including resource availability, to develop the approach that is most likely to elicit a complete and
balanced output. Overall, conducting a job analysis is a complex process that requires considerable
investment. The best science and good judgment are required in equal measure. The output of a
conscientious Physical Demands Analysis should provide a sound foundation for establishing
occupational fitness standards, focusing physical training programs, identifying health and safety
issues, and prioritizing those tasks that require job redesign. The long-term benefit to the
employer of implementing these strategies will be increased productivity through improved operational
effectiveness and reduced injury.

Employing appropriate types of tests to evaluate an individual’s ability to do specific, physically
demanding work is a second essential step in developing rational fitness standards. There are two
general types: basic ability tests and work sample tests. The basic ability tests (field or laboratory) include
those that assess aerobic fitness, body composition, muscle strength, and muscle endurance (flexibility,
agility, and balance are normally physical factors of less importance). A work sample test is designed to
duplicate the occupational task, whereas basic ability tests are of a lower order or more generic and
normally require less skill or coordination. Research documents that physically demanding
work sample test performance is largely dependent, to varying degrees, on aerobic fitness, body
composition, and strength. When basic ability tests are highly correlated with work sample tests, one test
administration option is to replace the work sample test with a basic ability test or combination of basic
ability tests. This is especially desirable when approaching testing for job selection purposes.

The sanctioned test validation methods are content validity, criterion-related validity, and
construct validity. At the risk of oversimplification, one might relate these methods to work sample
tests, basic ability tests, and a weighted combination of basic ability tests, respectively. Important to
the issue of test validity are the EEOC guidelines and the American Psychological Association
standards for validating educational and psychological tests. A major difference in physical test
validation is the use of physiological rather than psychological tests. The goal of physiological
validation is to define the physical or physiological capacity needed by a worker to perform the work
demanded by the task. Principal features of a physiological validation approach are the use of
ergonomic metrics to quantify test performance, along with the interpretation of validity results
in light of relevant physiological research and theory. Examples of validation studies are also reviewed
for outside crafts, firefighters, highway patrolmen, steel workers, coal mining, chemicals
production, electrical lineworkers, underwater divers, military, and manual lifting vocations.

In the employment setting, test scores may be used to determine and predict acceptable job
performance. Crucial to standards establishment is a rational methodology to establish passing test
scores that identify individuals who are able to perform, or be trained to perform, the essential job
tasks. These methodologies depend on the data generated when content and criterion-related
validity strategies are used to identify legally defensible passing scores. Effective criterion measures
assess and differentiate levels of job performance, and these data, along with test scores and
validity coefficients, are used to formulate passing scores that identify successful and unsuccessful
candidates. Methods to assess whether a passing score maximizes correct testing decisions while
minimizing testing errors include expectancy tables, contingency tables, and Taylor-Russell tables.
Issues related to the computation of test fairness, utility, and adverse impact, and their integration
with legal considerations, are also described. For example, test fairness and adverse impact on specific
populations of workers are very important considerations of physical testing standards, along with the
legal implications in many employment contexts. The effect of basic physiological tests (e.g.,
aerobic capacity, strength tests) and job simulations on the reduction of adverse impact is also shown
using comparisons from a variety of physically demanding jobs.
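
As a minimal sketch of how a contingency-table view can be used to compare candidate passing scores, the following counts correct testing decisions at each cut score and keeps the best one. It assumes that higher test scores are better and that a reliable pass/fail criterion of job performance exists for each person. The function name and all data are hypothetical, and this is not presented as the method used in any of the studies reviewed.

```python
def best_cut_score(scores, succeeded, candidate_cuts):
    """Return (cut, n_correct) for the cut score with the most correct decisions.

    scores         -- test score for each person (higher is assumed better)
    succeeded      -- True if that person met the job-performance criterion
    candidate_cuts -- passing scores to compare
    Ties are resolved in favor of the first (lowest-listed) cut score.
    """
    best = None
    for cut in candidate_cuts:
        # A decision is correct when "passes the test" agrees with
        # "succeeds on the job" (cells A and D of a contingency table).
        n_correct = sum((s >= cut) == ok for s, ok in zip(scores, succeeded))
        if best is None or n_correct > best[1]:
            best = (cut, n_correct)
    return best


if __name__ == "__main__":
    # Hypothetical validation sample of eight workers
    scores = [30, 35, 40, 45, 50, 55, 60, 65]
    succeeded = [False, False, True, False, True, True, True, True]
    print(best_cut_score(scores, succeeded, range(30, 70, 5)))
```

Counting correct decisions treats false positives and false negatives as equally costly. In practice the two errors rarely carry equal weight, and fairness and adverse impact also have to be considered when choosing a passing score.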

Legal forces and issues related to employment practices include Title VII of the Civil Rights
Act of 1964, the Age Discrimination in Employment Act (ADEA) of 1967, and the Americans With
Disabilities Act (ADA) of 1990. The centerpiece of employment discrimination law is Title VII of
the Civil Rights Act of 1964, as amended by Congress on several occasions. Title VII prohibits
employment discrimination because of “race, color, religion, sex, and national origin” by employers,
labor organizations, and employment agencies. Title VII is comprehensive in that everyone
is potentially covered: both genders and all majority and minority racial and ethnic
groups, as well as religious groups. The act does not, however, apply to military
personnel. The disparate impact theory is used to establish employment discrimination. This legal
process has a three-part burden of proof. First, the plaintiff (employee) must establish that the
hiring practice has a disparate impact on a protected group. Although not legally mandated, the
EEOC Guidelines are often used to define disparate impact. The Guidelines use the four-fifths
(4/5ths) rule, i.e., a pass rate less than 80% of the pass rate of the group with the highest pass rate,
to define adverse impact. Once adverse impact is established, the burden of proof then falls on the
defendant (employer) to justify that the exclusionary effect is a business necessity. The defendant
must show that the selection method is job related. A common method used to establish job
relatedness is a validation study. Lastly, if business necessity is established, the burden of proof
shifts back to the plaintiff to demonstrate that the employer failed to use a selection device that is
equally effective but has a lesser disparate impact. Many of the cases reviewed involve the use of
height and weight standards and tests for selecting public service employees such as police officers
and firefighters. The outcome of this litigation largely depended on the scientific quality of the
validation study. The recent court ruling of Lanning v. SEPTA (U.S. 3rd Circuit 1999) will likely
have a major impact on physical testing, and strongly suggests that validation studies will be
evaluated not only by standard psychometric criteria, but also by physiological validation of the test and cutscore.
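
The four-fifths rule described above reduces to simple arithmetic, sketched below. Each group’s pass rate is divided by the highest group pass rate, and a ratio below 0.8 is flagged. The group labels and pass rates are hypothetical, and the sketch covers only the screening calculation, not the legal analysis that follows it.

```python
def impact_ratios(pass_rates):
    """Divide each group's pass rate by the highest group pass rate."""
    highest = max(pass_rates.values())
    if highest == 0:
        raise ValueError("no group has a non-zero pass rate")
    return {group: rate / highest for group, rate in pass_rates.items()}


def four_fifths_flags(pass_rates, threshold=0.8):
    """Flag groups whose impact ratio falls below the 4/5ths threshold."""
    return {
        group: ratio < threshold
        for group, ratio in impact_ratios(pass_rates).items()
    }


if __name__ == "__main__":
    # Hypothetical pass rates on a physical test
    rates = {"group_a": 0.60, "group_b": 0.42}
    print(impact_ratios(rates))      # group_b ratio is about 0.70
    print(four_fifths_flags(rates))  # group_b is flagged, group_a is not
```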

Chapter 1


History  of  Occupational  Demands 
Measurement  and  the  Services’ 
Physical  Fitness  Programs 


Barbara  Palmer 

Human  Systems  Information  Analysis  Center 
Wright-Patterson  AFB,  OH 


Abstract 


Knowing how fit Military personnel should be is one focus of the field of occupational demands
measurement, a field that has roots in industrial engineering and occupational assessment,
individual differences, and physical fitness. This chapter shows how interest in individual differences
and physical fitness spans decades of time and a variety of focuses, and how the measurement of
individual differences in physical abilities has long been intertwined with the needs of the Military.
The physical fitness programs of the Military Services are documented, and an evaluation of the
fitness standards reveals that a norm-based process has been used to establish most of these
requirements. In the context of occupational fitness standards, only the U.S. Air Force employs an
occupational fitness test, the Strength Aptitude Test, which is administered at accession.


Introduction


Military  fitness,  or  the  lack  of  it,  has  been  an  issue  for  as  long  as  militia  have  existed.  A  low  point 
in the state of Military fitness may have occurred in the United States during the Spanish-American
War,  when  several  obese  U.S.  Army  generals  were  unable  to  mount  their  horses  at  the  Battle  of  San 
Juan  Hill  in  1898.  At  that  battle,  a  cavalry  charge  was  led  by  a  nephew  of  General  Robert  E.  Lee, 
General  Fitzhugh  Lee,  who  was  forced  to  travel  into  battle  on  the  back  of  a  donkey  (DiDonato, 
2000). 

This chapter reviews significant milestones in the history of individual differences and physical
fitness. For example, progress in the measurement of individual differences in physical abilities was
prompted by the needs of the Military, especially the manpower requirements of World Wars I
and II. More recently, the requirement for equal opportunity in the workplace has led to new
interest in establishing occupational fitness standards.

Establishing meaningful fitness standards is one focus of the field of occupational demands
measurement, a field that encompasses the knowledge bases of several disciplines. In a chapter titled
Physical Abilities in the Handbook of Industrial & Organizational Psychology, Hogan (1991a) states
that, “We have no formal history from which current efforts in physical abilities research evolved.
Like the field of industrial psychology, roots can be traced to aspects of industrial engineering,
applied psychology, and individual differences measurement. The common theme among these
otherwise diverse fields is that of measurement...” (Hogan, 1991a, p. 755).

Following a brief history of the contributions of the fields of occupational assessment,
individual differences, and measurement of fitness and Military fitness, this chapter summarizes the
fitness programs of the United States Military Services, and describes the history and status of
job-specific fitness standards in the Military.


History  of  Occupational  Demands  Measurement 


This section describes the major achievements, from the late 1800s to the present, of three fields
of endeavor that have coalesced into the area of occupational demands measurement: occupational
assessment, the study of individual differences, and fitness measurement and Military fitness.


Occupational  Assessment 

A progression of quantitative approaches to workplace activities has led to today’s interest in
establishing occupational fitness standards. This progression began when pioneers in industrial
engineering made an early impact on the measurement of workplace activities. Foremost among
these early researchers was Frederick W. Taylor (Taylor, 1923), who espoused the practice of
scientific management. His methodology for the study of scientific management was time study,
still a major tool for industrial engineers, which Taylor introduced to the Midvale Steel Company in
1881. Taylor employed a series of investigations to determine what made up a day’s work for a “first
class man.” He timed the elements that composed a task, then translated the work done into
foot-pounds or fractions of horsepower. His method required the precise timing of each of the smallest
elements of a task, a determination of the quickest and most efficient motions to complete the
operation, and computation of the amount of rest needed to perform each task. After these
standard times were determined, Taylor was able to form the basis for training, performance
measurement, incentives, and compensation.

Taylor’s scientific, quantitative view of the workplace was followed by the work of Lillian and Frank
Gilbreth, who developed a related concept, that of motion study, in an effort to determine the best
method for bricklaying (Gilbreth, 1901). Frank Gilbreth observed that the types and numbers of
movements employed by different workers varied, and were associated with different levels
of output. He defined motion study as “the science of eliminating wastefulness from using
unnecessary, ill-directed, and inefficient motions.” The principles of time and motion study remain much
the same since their development by Taylor and Gilbreth.

The fact that these principles have endured is illustrated in the following
quotation from Hogan (1991a):

Today’s practicing industrial engineers receive training in time and motion methods, and Taylor
and Gilbreth’s procedures form the basis of task analysis and personnel training for jobs that require
skilled motor performance. In addition, substantial research and application has led to the successful
adoption of industrial performance standards, i.e., cut-points. Using time and motion analyses,
standard performance times are adopted for performance of tasks within a system. Although performance
standards serve many purposes, including performance appraisal, the standards provide
an answer to the original and fundamental question of what constitutes a day’s work. (p. 758)

The  greatest  contribution  to  occupational  assessment  methodology  during  recent  decades  was 
Edwin  Fleishman’s  work  (1964).  His  research  in  human  abilities  and  performance  taxonomies  pro¬ 
vided  a  basis  for  the  evaluation  of  Military  and  educational  physical  fitness  tests.  Fleishman’s  tax¬ 
onomies  were  composed  of  attributes  characterized  as  relatively  unchanging  “enduring  traits” 
(1964,  p.12).  He  conducted  the  first  systematic  research  designed  to  determine  the  number  of 
attributes  necessary  to  generate  an  adequate  taxonomy  of  physical  performance.  His  nine  basic 
physical  abilities  form  three  classes  important  to  the  structure  of  occupational  fitness  standards: 
muscular strength, cardiovascular endurance, and factors affecting movement quality.

Factor analysis of physical fitness dimensions, including strength, speed, flexibility, balance, and
coordination scores, led to recommendations that Military fitness tests include stamina and
cardiovascular endurance factors, which had not previously been included. A reanalysis of this work
by Hogan (1991b) forms a classic study of work requirements and occupational performance.

Measurement  of  individual  differences 

Parallel  efforts  in  the  field  of  applied  scientific  psychology  during  the  late  1880s  to  the  early 
1900s  were  focused  on  understanding  and  measuring  differences  among  individuals,  a  field  known 
as psychometrics. In the late 1880s, Francis Galton gathered data on both physical and psychological
characteristics (Thorndike & Hagan, 1969), being interested first in techniques of precise measurement
and secondarily in the development of statistical procedures with which to make the data
meaningful. Further development of statistical tools was accomplished by Karl Pearson and Charles
Spearman. Their development of statistical techniques made possible the accurate analysis and
description of patterns of individual differences (Thorndike & Hagan, 1969). Heavily influenced by
Galton, James McKeen Cattell furthered the research in individual differences and wrote extensively
about physical anthropometry and mental tests (Hogan, 1991a). The focus of both Galton and
Cattell’s work was human development, with the hope that assessment of individual traits could be
used in educational and vocational arenas. Although two research studies (Sharp, 1898; Wissler,
1901) criticized Cattell’s work and effectively ended his investigations, a new need for assessments
of physical fitness and psychological fitness arose as the United States entered World War I.

Between World Wars I and II, there was a renewed interest in using tests and measurements
for  scientific  and  instructional  applications  (Hogan,  1991a).  E.L.  Thorndike,  a  student  of  Cattell’s 
before  the  turn  of  the  century,  influenced  the  development  of  standardized  educational  tests.  Alfred 
Binet and Lewis Terman produced several versions of intelligence tests, while motor testing was the
focus  of  David  K.  Brace  and  Frederick  Rand  Rogers.  Drawing  on  the  foundation  created  by  these 
leaders,  as  well  as  that  established  by  Clifford  Lee  Brownell,  Harvey  Lehman,  and  Paul  Andrew 
Witty,  the  decade  before  World  War  II  experienced  an  intensive  increase  in  scientific  exploration 
and  documentation  that  underlies  much  of  the  current  practice  (Hogan,  1991a). 

Physical  Fitness  Measurement  and  Military  Fitness 

The  state  of  physical  fitness  assessment  in  the  1890s  had  consisted  largely  of  three  areas  of 
measurement — anthropometry;  muscular  strength,  endurance,  and  power;  and  cardiovascular  fit¬ 
ness  (Maud,  1995).  Early  sets  of  anthropometric  measurements  (Hitchcock  &  Seelye,  1893) 
included  age,  height,  and  weight;  and  chest,  arm,  and  forearm  girth.  Seaver  (1890)  is  considered 
the  first  modern  author  of  fitness  anthropometry.  The  second  area  of  fitness  measurement  of  this 
era was headed by Dudley Sargent, who is considered the father of modern strength testing
(Hartwell, 1885). Sargent’s 1921 article, “The Physical Test of Man,” described the vertical jump
test, one of the first tests of muscular power. Kellogg (1896), who described the universal
dynamometer that made this type of testing more accurate, conducted further work in the area of
strength testing. Cardiovascular fitness testing in the 1900s was conducted with the Foster
Cardiovascular Test, which correlated the response rate of the heart to exercise and recovery (Foster,
1914). The Barach Cardio-Vascular Test evaluated blood pressure and heart rate to determine
cardiovascular function (Barach, 1919). Eventually, Schneider’s (1920) more standardized combination
of pulse rate and blood pressure assessment was used, taken with the subject both horizontal
and  standing,  before  and  after  exercise. 

Although  the  methodologies  for  measurement  were  becoming  more  precise,  the  levels  of  fitness 
in  the  Military  were  not  encouraging  during  this  era,  as  illustrated  at  the  beginning  of  this  chap¬ 
ter.  This  low  point  in  Military  fitness  levels  spawned  the  need  for  assessments  of  physical  and  psy¬ 
chological  fitness  in  the  Military  as  the  first  World  War  loomed.  The  Commission  on  Training 
Camp  Activities  was  formed  to  develop  the  physical  capacities  of  World  War  I  recruits,  one  third 
of  whom  were  found  physically  unfit  for  duty,  according  to  data  cited  by  Bucher  (1968).  The  situ¬ 
ation  was  not  much  better  on  the  civilian  front,  with  a  1918  study  of  industrial  employees  showing 
that 270 million workdays were lost due to “loss of health and vigor” (Hackensmith, 1966, p. 412).
These  conditions  led  to  legislation  encouraging  health  and  physical  education  in  schools. 

As  the  measurement  of  physical  abilities  began  to  improve,  data  were  used  to  develop  popula¬ 
tion  norms.  Rogers  (1927)  developed  the  Strength  Index  and  the  Physical  Fitness  Index,  which 
consisted  of  seven  measures,  including  hand,  shoulder  girdle,  back,  leg  and  arm  strength,  and  forced 
vital  capacity.  The  index  was  derived  by  comparison  to  norms  based  on  gender,  age,  and  weight;  and 
its use led to great popularity of strength testing. Through the years, modifications to this test have
included the elimination of pull-up and dip tests, which were replaced with static tests of arm
strength  using  the  back/leg  dynamometer  (MacCurdy,  1933).  Dynamometer  testing  of  isometric 
strength  is  still  used,  but  the  more  popular  method  is  to  assess  maximum  weight  moved  in  one  rep¬ 
etition  (1RM).  These  indexes  and  tests  measured  the  same  three  areas  of  physical  abilities  that  were 
measured  in  the  1890s,  but  were  updated  and  refined  for  the  time  span  of  the  1920s. 

Cardiovascular  fitness  assessment  of  the  early  1920s  was  characterized  by  Schneider’s  (1920) 
evaluation  of  World  War  I  aviators.  He  combined  pulse  rate  and  blood  pressure  from  subjects  in  the 
standing  and  horizontal  positions,  and  assessed  pulse  rate  after  15  seconds  of  bench  stepping  and 
during  recovery.  This  method  was  followed  by  Tuttle’s  Pulse  Ratio  Test  (1931),  which  assessed  pulse 
rate  before  and  after  a  prescribed  period  of  bench-stepping  activity.  Currently,  the  Harvard  Step  Test 
(Brouha, 1943) is still used; alternatively, and perhaps more frequently, heart rates taken during
and after exercise on a treadmill or a bicycle ergometer are used. The criterion measure for VO2max
is a maximal test that measures oxygen uptake directly; it requires sophisticated instrumentation
and is normally performed on a cycle ergometer or a treadmill.

Results of fitness tests made the lack of readiness of recruits entering World War I obvious. In
response, the U.S. Army and U.S. Air Force set up a physical training course under baseball Hall-of-Famer
Hank Greenberg; and the U.S. Navy set up a program under heavyweight boxing champion
Gene Tunney. In 1942, the Office of Defense, Health, and Welfare Services developed a Division of
Physical Fitness; and the Federal Security Agency established a committee on physical fitness in
1943. Because World War II statistics revealed that 900,000 out of 2 million volunteers were rejected
because of “mental or physical defects” (Hogan, 1991), the necessity of fitness for all age groups
became apparent in the United States. The Minimum Muscular Fitness Test in School Children
(Kraus & Hirshland, 1954) indicated that U.S. youth were less fit than their European counterparts.
These results led to President Eisenhower’s Presidential Conference on Fitness of American Youth
in 1955, followed by the establishment of the President’s Council on Youth Fitness, which continues
to this day.

Continued  concern  for  Military  readiness  occurred  during  the  Korean  Conflict  and  the  Cold 
War.  The  concern  about  Military  readiness  was  balanced  with  concern  about  the  fitness  of  the 
American  public,  especially  the  youth.  School  children’s  fitness  was  assessed  via  the  American 
Association  of  Health,  Physical  Education,  and  Recreation  National  Fitness  Tests  (AAHPER), 
which  President  Kennedy  expanded  into  a  fitness  development  program  (Hunsacker,  1958). 

In the 1960s on, there was a considerable push to increase the fitness levels of the American public.
President Jimmy Carter commissioned a study to document the state of Military fitness in 1983
which led to body fat assessment programs being established by the Services (Institute of Medicine,
1998). During this decade and the next, increases in physical fitness testing and standardization were
seen on many fronts. The American College of Sports Medicine led the way in making fitness a
quantifiable, scientific study and published Guidelines for Exercise Testing and Prescription in 1991 and
1995. The YMCA Test Battery, the Canadian Standardized Test of Fitness (CSTF), the AAHPERD
Test Battery, FITNESSGRAM, and the Eurofit Test Battery for Adults are all widely used
tests.  Norm-referenced  standards  were  documented  for  several  populations,  including  the  Military. 
The  Military  standards  that  were  derived  during  this  era  are  described  in  the  following  section,  and 
research efforts to increase ties to scientific findings over time are discussed.

Physical and Occupational Fitness Programs of the U.S.
Military Services


Physical  Fitness  Programs 

Fitness  standards  in  the  Military  have  been  established  over  the  years  to  promote  health  and 
general physical fitness. General (or health-related) physical fitness standards are applicable to all
Service  members  regardless  of  their  occupation.  These  standards  are  not  intended  to  specifically 
enhance  the  performance  of  a  particular  Service  mission  or  job  (see  Chapter  2). 

DoD’s  guidance,  through  Department  of  Defense  Directive  1308.1,  requires  that  the  Services 
establish  a  physical  fitness  and  body  fat  program  that  includes  fitness  requirements  for  all  Service 
members (U.S. Department of Defense, 1995). This guidance requires that, regardless of age, all
Service members be measured along three dimensions annually: cardiovascular endurance (measured
by activities such as running a certain distance within a specified time limit); muscular strength
and  endurance  (measured  by  activities  such  as  sit-ups  and  push-ups);  and  maintenance  of  body  fat 
within  a  certain  percentage  range.  The  guidance  does  not  specify  particular  testing  activities  or 
minimum  required  levels  of  difficulty.  Each  Military  Service  is  required  to  design  its  own  fitness 
program  to  meet  mission-specific  needs  (U.S.  Department  of  Defense,  1995). 

Department  of  Defense  Instruction  1308.3,  Physical  Fitness  and  Body  Fat  Programs 
Procedures  (U.S.  Department  of  Defense,  1995)  applies  to  the  Office  of  the  Secretary  of  Defense, 
the  Military  Departments,  the  Chairman  of  the  Joint  Chiefs  of  Staff,  and  the  Unified  Combatant 
Commands.  This  policy  states — 

It is DoD policy that physical fitness is essential to combat readiness and is an important part of
the general health and well-being for Armed Forces personnel. Individual Service members must
possess the cardio-respiratory endurance, muscular strength and endurance, and whole-body flexibility
to successfully perform in accordance with their Service-specific mission, and Military specialty.
Those qualities, as well as balance, agility, and explosive power, together with levels of body
composition form the basis of the DoD Physical Fitness and Body Fat Program (p. 1).

This  same  Instruction  defines  fitness  as — 

The ability of Service members to meet the physical demands of their jobs for an extended period of
time and to have the additional ability of meeting physical emergencies, such as those imposed during
combat or other stressful situations.

To  help  illustrate  the  qualities  and  the  physical  fitness  guidelines  outlined  in  the  directive  above, 
Table 1.1 compares the current fitness requirements of the U.S. Army, U.S. Air Force, U.S. Navy,
and U.S. Marine Corps. Table 1.2 outlines the standards of the fitness tests used by the Services.
Following this, Table 1.3 compares the body composition standards of the four Services, which
serve as a weight-for-height screen, and Table 1.4 indicates the components taken into account
when the Services use circumference measurements to estimate body composition.


Table 1.1 Comparison of basic characteristics of the physical programs of the four services

Army
  Reference:       Regulation 350-41, 600-9, and 600-63; FM 21-20
  Objective/Goal:  Combat and operational readiness; healthy life style; military appearance
  Components:      Aerobic capacity; upper body/trunk strength/endurance; body fat
  Test Items:      2-mile run; push-up; sit-up; body fat by tape

Air Force
  Reference:       Instruction 40-501 and 40-502
  Objective/Goal:  Motivation to train; fit and healthy force
  Components:      Aerobic capacity; upper body/trunk strength/endurance*; body fat
  Test Items:      Submax cycle ergometer prediction of VO2max; push-up; ab crunch; body fat by tape

Navy
  Reference:       Instruction 6110.1F
  Objective/Goal:  Optimal health; stamina for optimal readiness
  Components:      Aerobic capacity; upper body/trunk strength/endurance; flexibility; body fat
  Test Items:      1.5-mile run/walk, or 500-yard swim; curl-up; push-up; sit and reach; body fat by tape

Marine Corps
  Reference:       Order 6100.1c
  Objective/Goal:  Overall fitness; mission/combat readiness
  Components:      Aerobic capacity; upper body/trunk strength/endurance; body fat
  Test Items:      3-mile run; ab crunch; pull-up (male); flexed arm hang (female); body fat by tape

* Proposed addition, trial period through December 01.


Occupational  Standards  Programs 


The purpose of job-specific physical performance standards is to ensure that personnel assigned
to  physically  demanding  jobs  can  perform  those  jobs.  Occupational  fitness  standards  indicate  the 
level  at  which  an  individual  must  perform  in  order  to  successfully  meet  job  requirements,  regard¬ 
less  of  body  size  or  gender.  Military  scientists  have  proposed  approaches  to  quantify  various  task 
categories  to  determine  the  feasibility  of  establishing  groups  of  fitness  requirements  that  will  be 
specific  to  job  category. 

Job-specific performance in the Military has been a concern since the U.S. Army Air Corps
Aviation Psychology Program began in World War II (Hogan, 1991). During the last three decades,
interest in job-specific performance testing has increased with the dramatic influx of women into the
Military Services as well as into demanding occupational specialties. With increasing numbers of
women entering the Military Services, in 1976, the GAO encouraged the DoD to develop physical
standards for job performance using the Department of Labor system of classification (GAO, 1976,
as cited in IOM, 1998). The GAO Guideline (Government Accounting Office, 1998a) report indicates
that Section 543 of the 1994 National Defense Authorization Act, “Required the Secretary of
Defense to prescribe physical performance standards for any occupation in which the Secretary
determined that strength, endurance, and cardiovascular capacity was essential to the performance
of duties” (p. 8). This act requires that, if developed, these standards pertain to job activities that
are commonly performed in that occupation, be relevant to successful physical performance
of those tasks, and not be based on gender. “In other words, job-specific physical performance
standards will identify the absolute minimum level needed for successful performance in
those occupations. Anyone in that occupation, regardless of gender, will be required to meet the same

Table 1.2 Comparison of physical fitness assessment standards, adjusted for age and gender, across the
four services (minimums)


Aerobic Capacity

Army                    Air Force                     Navy                    Marine Corps
2-mile Run              Submaximal Cycle Ergometry    1.5-mile Run/Walk       3-mile Run
(min:sec)               (ml/kg/min VO2max)            (min:sec)               (min:sec)
Age    Male   Female    Age    Male  Female           Age    Male   Female    Age    Male    Female
17-21  15:54  18:54     <24    27    35               17-19  12:30  15:00     17-26  28 min  31 min
22-26  16:36  19:36     25-29  27    34               20-29  13:30  15:30     27-39  29 min  32 min
27-31  17:00  20:30     30-34  27    32               30-39  14:30  16:45     40-45  30 min  33 min
32-36  17:42  21:42     35-39  26    31               40-49  15:30  17:15     46+    33 min  36 min
37-41  18:18  22:42     40-44  26    30               50+    16:45  17:30
42-46  18:42  23:42     45-49  25    29
47-51  19:30  24:00     50-54  24    28
52-56  19:48  24:24     55-59  22    27
57-61  19:54  24:48
62+    20:00  25:00

Upper Body Strength/Endurance
(Air Force: proposed addition of push-ups, in trial period through December 01)

Army                    Navy                    Marine Corps
Push-ups in 2 minutes   Push-ups in 2 minutes   Pull-ups (Males); Flexed Arm Hang (Females)
Age    Male   Female    Age    Male   Female    Age    Male   Female
17-21  42     19        17-19  42     19        17-26  3      15 sec
22-26  40     17        20-29  37     16        27-39  3      15 sec
27-31  39     17        30-39  31     11        40-45  3      15 sec
32-36  36     15        40-49  24     7         46+    3      15 sec
37-41  34     13        50+    19     2
42-46  30     12
47-51  25     10
52-56  20     9
57-61  18     8
62+    16     7

Abdominal Strength/Endurance
(Air Force: proposed addition of crunches, in trial period through December 01)

Army                    Navy                    Marine Corps
Sit-ups in 2 minutes    Curl-ups in 2 minutes   Ab crunches in 2 minutes
Age    Male   Female    Age    Male   Female    Age    Male   Female
17-21  53     53        17-19  50     50        17-26  [illegible]
22-26  50     50        20-29  46     46        27-39  45     45
27-31  45     45        30-39  40     40        40-45  45     45
32-36  42     42        40-49  35     35        46+    40     40
37-41  38     38        50+    29     29
42-46  32     32
47-51  30     30
52-56  28     28
57-61  27     27
62+    26     26

Table 1.3 Percent body fat standards for the military services and the U.S. Coast Guard

                  Men                        Women
Military Branch   Age (Years)  % Body Fat    Age (Years)  % Body Fat
Air Force         [illegible]  [illegible]   [illegible]  28
                  30+          24            30+          32
Army              [illegible]  [illegible]   [illegible]  [illegible]
                  21-27        22            21-27        32
                  28-39        24            28-39        34
                  40+          26            40+          36
Navy              [illegible]
Marine Corps      [illegible]
Coast Guard       [illegible]  [illegible]   [illegible]  [illegible]
                  [illegible]  25            31-39        35
                  40+          27            40+          37

Table 1.4 Measurement sites for circumferential taping for body composition assessment


Military Branch     Men             Women
U.S. Air Force      Abdomen, Neck   Waist, Hip, Neck
U.S. Army           Abdomen, Neck   Wrist, Neck, Forearm, Hip
U.S. Navy           Abdomen, Neck   Waist, Hip, Neck
U.S. Marine Corps   Abdomen, Neck   Waist, Hip, Neck

standard”  (Government  Accounting  Office,  1998a,  p.  8).  Each  Service  has  categorized  its  occupa¬ 
tional  specialties  according  to  upper  body  strength,  and  the  P-U-L-H-E-S  physical  profile  (physi¬ 
cal  capacity  or  stamina,  use  of  upper  and  lower  extremities,  hearing  acuity,  normal  color  vision  (e), 
and  special  psychiatric  characteristics)  (Institute  of  Medicine,  1998). 

In 1993, legislation that opened all MOSs (except direct combat) to women led to the GAO
being  directed  to  reopen  the  issues  of  job-specific  performance  testing.  The  GAO  recommended 
that  the  Services  determine  whether  a  significant  problem  existed  in  the  accomplishment  of  phys¬ 
ically  demanding  occupations,  and  to  identify  ways  to  solve  the  problem.  The  GAO  report 
(Government  Accounting  Office,  1996)  recommended  establishing  valid  performance  standards, 
providing  job  training,  and  redesigning  job  tasks.  Only  the  U.S.  Air  Force  requires  recruits  to  take 
a strength test for job assignment (Institute of Medicine, 1998).

Presently,  the  U.S.  Air  Force  AFSCs  are  categorized  into  eight  physical  demand  categories,  and 
the U.S. Air Force uses a strength aptitude test to screen out those recruits who would not be likely
to perform successfully on a given job (U.S. Department of the Air Force, 1994). The U.S. Air Force
does not incorporate this fitness test into the required annual fitness evaluation. The U.S. Navy has
not  adopted  occupational  strength  standards  for  active  duty  personnel  or  recruits,  nor  has  the  U.S. 
Army.  The  U.S.  Marine  Corps  at  one  time  administered  a  physical  readiness  test  of  combat  skills,  but 
this  has  been  discontinued. 

History  of  U.S.  Army  Fitness  Program 

Physical  Fitness  —  Beginning  in  the  middle  of  the  nineteenth  century,  a  European  style  of  physi¬ 
cal  training  was  used  in  the  U.S.  Army.  This  fitness  program  was  initiated  by  Herman  J.  Koehler, 
who was the “architect of modern U.S. Army physical readiness training”
(http://192.153.150.25/usapfs/pages/history1885%2D1920.htm). He was a master of the German Turner
System, and until WW I, physical training was under his leadership. Near the end of this era, a new
method of U.S. Army fitness was implemented, following the publication of the 1914 Manual of
Physical Training for the United States Army. Between this time and the second World War,
competitive athletics were used as a supplement to conditioning for soldiers. The emphasis on motor
skill development was aided through climbing, leaping, and tumbling. Many exercises were used for
developing coordination, including marching, running, dumbbell and club exercises, and jumping.
Still other qualities, such as self-reliance and confidence, were emphasized through swimming,
climbing, boxing, wrestling, and gymnastic exercises
(http://192.153.150.25/usapfs/pages/history1885%2D1920.htm).

During  the  U.S.  involvement  in  WW II,  the  U.S.  Army  called  on  civilian  and  Military  special¬ 
ists  to  assist  in  the  formation  of  a  modern  physical  fitness  program.  This  would  be  the  first  pro¬ 
gram  of  physical  training  that  could  be  justified  by  accepted  scientific  testing  procedures.  Physical 
fitness  was  encouraged  during  WW  II  training  camp  with  such  expressions  as,  “the  more  sweat,  the 
less blood” (http://192.153.150.25/usapfs/pages/history1885%2D1920.htm). The importance of
strength  and  good  health  was  stressed  as  well  as  physical  conditioning  that  helped  the  soldier 
endure  severe  physical  demands.  Training  to  facilitate  realistic  combat  situations  was  conducted 
with  a  soldier's  full  pack,  weapons,  and  ammunition.  The  formal  calisthenics  practiced  during 
WWI  were  now  being  supplemented  with  a  variety  of  difficult  physical  challenges. 

After WWII, the U.S. Army consolidated physical training and athletics and revised the U.S.
Army physical fitness publication FM 21-20, which was again revised in 1985, 1992, and 1998.
During the 1960s, the need for a flexible response to a variety of threats led to an emphasis on combat
physical training. President Carter’s 1980 call for renewed physical fitness efforts in the Military
resulted in a new U.S. Army fitness initiative and the creation of the U.S. Army Physical Fitness
School, established at Fort Benjamin Harrison and then moved to Fort Benning in 1997. The program
now emphasizes physical training for the battlefield, health, diet, and other aspects of wellness
(http://www-benning.army.mil/usapfs/Doctrine/History/historykorea-present.htm).

Current  U.S.  Army  Fitness  Program 

Physical  Fitness  —  The  U.S.  Army  conducts  the  U.S.  Army  Physical  Fitness  Test  twice  a  year  and  a 
weight assessment once a year. Field Manual 21-20 directs the U.S. Army fitness program (U.S.
Department of the Army, 1998). The U.S. Army Physical Fitness Test (Vogel, 1986) is administered
to all personnel through age 60; individuals 40 years of age and older receive a pre-test physical
and coronary risk assessment. The current set of tests was chosen on the basis of ease of administration,
relative objectivity of scoring, and lack of needed equipment (Vogel, 1986). The three events
are a timed 2-mile run, the maximal number of extended-leg push-ups within 2 minutes, and the
maximal number of bent-knee sit-ups within 2 minutes. Acceptable substitutions for the 2-mile run
are a 6.2-mile bike ride, a 2.5-mile walk, or an 800-yard swim. Scoring differs by age and sex.

In October 1998, the U.S. Army began to implement new standards that were again norm-based.
This was done with the intent of maintaining a gender-neutral, equal-points-for-equal-effort
standard. The new minimum scores are based at or near the 8th percentile of a sample of
scores collected during a 1995 study, with maximums set at the 90th percentile. Standard scores are
reduced in 5-year increments as age increases. The new standards increase some of the requirements
for both sexes, requiring women to perform the same number of sit-ups as men. Female run
times are about 14 to 16 percent slower than male times, and female push-up requirements will
increase from 44 to about 50 percent of the male standards (U.S. Department of the Army, 1998).
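
A norm-based standard of this kind can be illustrated with a short Python sketch. It is not the Army's actual procedure: the function name `cut_scores`, the interpolation method, and the sample data are assumptions made for illustration. The sketch takes a sample of scores and returns the values at the 8th and 90th percentiles of performance; for timed events, where a lower number is a better performance, the percentiles are mirrored.

```python
from statistics import quantiles


def cut_scores(sample, min_pct=8, max_pct=90, higher_is_better=True):
    """Return (minimum standard, maximum standard) from a sample of scores.

    min_pct / max_pct are performance percentiles (8th and 90th in the text).
    For events such as run times, set higher_is_better=False so that the
    8th percentile of performance maps to the 92nd percentile of raw times.
    """
    # 99 cut points; cuts[p - 1] is the p-th percentile of the raw scores.
    cuts = quantiles(sample, n=100, method="inclusive")
    if higher_is_better:
        return cuts[min_pct - 1], cuts[max_pct - 1]
    return cuts[100 - min_pct - 1], cuts[100 - max_pct - 1]


# Hypothetical sample: repetition counts 0..100, one soldier per value.
push_ups = list(range(101))
print(cut_scores(push_ups))
print(cut_scores(push_ups, higher_is_better=False))
```

In practice the sample would be split by age group and sex before computing the cut scores, since the text notes that scoring differs along both dimensions.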

As with the other Services, the U.S. Army also has body composition requirements. Currently,
the U.S. Army allows body fat of up to 20 to 26 percent for men and 30 to 36 percent for
women, depending on age. The lower figure for males was derived from data on young male soldiers
from a decade ago (GAO, 1998a); adding 2 percent body fat for each decade after the second yields
the upper figure of 26 percent. Up until 1991, female standards of 28 to 34 percent were derived by
adding 8 percentage points to the male standards. These figures were determined to be more restrictive
than the men’s when the allowable body fat was compared to the means of same-sex recruits. In
1991, 30 and 36 percent were established as the lower and upper figures for U.S. Army women. The
equations incorporating circumference measures used by the U.S. Army were developed by Vogel
and coworkers (Vogel, Kirkpatrick, Fitzgerald, Hodgdon, & Harman, 1988) and include measurement
of wrist, neck, forearm, and hip for females, and neck and abdomen for males.
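
The arithmetic behind these figures can be checked with a brief Python sketch. The base value of 20 percent, the 2-point step, and the 8-point female offset come from the text; the assumption of four age bands (so that the top male value is 26) and the 10-point offset used to reproduce the 1991 female range of 30 to 36 percent are inferences from the numbers, not statements from the source.

```python
MALE_BASE = 20      # percent body fat for the youngest age band (from the text)
STEP_PER_BAND = 2   # percentage points added for each older band (from the text)
N_BANDS = 4         # assumed number of age bands, giving a top value of 26


def male_limits():
    """Maximum allowable percent body fat for men, youngest band first."""
    return [MALE_BASE + STEP_PER_BAND * i for i in range(N_BANDS)]


def female_limits(offset):
    """Female limits obtained by adding a fixed offset to the male limits."""
    return [limit + offset for limit in male_limits()]


print(male_limits())       # [20, 22, 24, 26]
print(female_limits(8))    # [28, 30, 32, 34]  pre-1991 standards
print(female_limits(10))   # [30, 32, 34, 36]  range established in 1991
```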

For  U.S.  Army  enlisted  personnel  and  non-commissioned  officers,  physical  training  each  morn¬ 
ing  is  usually  mandatory.  For  the  most  part,  officers  are  responsible  for  their  own  physical  condi¬ 
tioning  program,  although  whether  this  is  voluntary  or  mandatory  varies  from  post  to  post. 

Occupational  Fitness — As  many  Military  Occupational  Specialties  (MOSs)  opened  to  U.S.  Army 
women in the 1970s, the U.S. Army Research Institute of Environmental Medicine (USARIEM)
developed a battery of performance tests corresponding to capacities needed for the various MOSs,
but unfortunately the tests were not applied. In 1981, USARIEM was again tasked with developing
and validating a gender-neutral strength test to be administered as an entrance criterion. The
resulting MEPSCAT (Military Entrance Physical Strength Capacity Test) was an incremental
dynamic lift test, similar to the SAT, which predicted performance on job-related criterion performance
tasks (Institute of Medicine, 1998). The study that validated the incremental lift test for
this application was criticized for misanalyzing the data and overstating the correlation between lift
performance and criterion test performance. The MEPSCAT was administered until 1990 and was
used to counsel recruits about requirements of various physical occupations. Currently there is no
U.S. Army occupational fitness test in place (Institute of Medicine, 1998).

History  of  U.S.  Air  Force  Fitness  Program 

Physical Fitness — The first U.S. Air Force publication on physical fitness was published in 1947 (U.S.
Department of the Air Force, 1947). Although no standard program was documented, and no
specific level of fitness was required, AFR 50-5 stated that the U.S. Air Force’s fitness program should:

Develop and maintain a high level of physical fitness in the individual so that he can perform more
efficiently his assigned duties
Encourage regular and healthful exercise
Foster an aggressive and cooperative team spirit, increase the confidence of the individual, develop
sportsmanship, and increase pride through participation in competitive athletics (cited in
Schellhous, 1982, p. 14)

Further guidance was included in U.S. Air Force Manual 160-26, Physical Conditioning,
published in 1961 (U.S. Department of the Air Force, 1961). The manual stated that it was the
commander’s responsibility to see that his men were developed physically, psychologically, and socially.
A standard program was still not established in this improved and updated document. A 1959 study
(Balke & Ware, cited in Schellhous, 1982) of fitness levels in the U.S. Air Force found that the
overall state was “poor,” and declared that the U.S. Air Force physical fitness program was
ineffective. AFR 50-5 was revised in 1959. Commanders were required to establish a physical
conditioning program, establish weight limits, and prescribe regular weekly exercise. Still, no standard
program or levels of fitness were established (Schellhous, 1982). The increased national interest in
physical fitness in the early 1960s led the U.S. Air Force to adopt the Royal Canadian Air Force
Five Basic Exercise (5BX) Plan as its official fitness program (U.S. Department of the Air Force,
1962). The U.S. Air Force Pamphlet (AFP) 50-5-1 regulated the program for men, and AFP
50-5-2 outlined the Ten Basic Exercise Plan (XBX) for women. The 5BX program was designed
to progressively work the skeletal muscles, heart, and lungs until a given level of fitness was
achieved. A specified number of repetitions of each of five exercises was to be completed within
11 minutes. In 1963, U.S. Air Force personnel and researchers from Indiana University reported
that the 5BX program was fraught with problems, including a high failure rate, an unsatisfactory
testing program, and a lack of emphasis on the importance of physical fitness (Schellhous, 1982).

Dr. Kenneth Cooper, a U.S. Air Force flight surgeon who is considered to be the founder of the
aerobics movement, was behind the next iteration of the U.S. Air Force fitness program, in which an
aerobics test was administered semi-annually to all personnel. The test consisted of a timed 1.5-mile
run. Taking into consideration the person’s age and the time taken to finish the run, personnel were
put into one of five fitness categories ranging from very poor to excellent (U.S. Department of the
Air Force, 1977). It was suggested that concern about fatalities during testing led the U.S. Air Force
Surgeon General to modify AFR 35-11 in 1979. Personnel age 35 and over were tested using a
3-mile walk rather than the 1.5-mile run. This change was unpopular, and by 1980, all personnel were
permitted to run rather than walk for the annual fitness test. In 1981, AFR 35-11 (U.S. Department
of the Air Force, 1981) indicated that members to be tested annually could choose the 1.5-mile
run, the 3-mile walk, or stationary running. In March 1982, the U.S. Air Force began assessing the
cycle ergometry test to estimate work capacity. This led to the current U.S. Air Force cycle
ergometry program, which was evaluated during a ten-year period and implemented in 1992.

Current U.S. Air Force Fitness Program

Physical Fitness — The U.S. Air Force Medical Operations Agency established the U.S. Air Force
Fitness Program Office in June 1995. The U.S. Air Force Fitness Program Office implements,
sustains, and supports the U.S. Air Force Fitness Program (AFFP) for all U.S. Air Force entities. The
U.S. Air Force Surgeon General’s office is responsible for AFFP policy and procedures.

U.S. Air Force Instruction 40-501 (Medical Command, Department of the Air Force, 1996)
outlines the U.S. Air Force Fitness Program, and states that cardiovascular (aerobic) fitness is the
single best indicator of total physical fitness. The current cycle ergometry test was implemented in
October 1992, and all U.S. Air Force personnel are tested annually. Cycle ergometry is used because
it is a reliable and safe estimate of cardiovascular fitness and is characterized by acceptable
physiological validity. It is a submaximal test, based on the physiological principle that the heart rate
increases as work intensity and oxygen consumption increase. The individual’s heart rate and
workload are used to estimate VO2max. As an adaptation to physical training, heart rate is expected to
decrease for a given level of work. This test has demonstrated its correlation with the graded
treadmill test (Pollock et al., 1994). The testing procedure basically assesses heart rate at the end of a
6 to 8 minute steady-state cycling period. Minimum passing scores, driven by steady-state heart rate,
are established by sex and age. In its current iteration, the test is graded pass or fail only, with
respect to population norms. An informal account of the U.S. Air Force’s fitness history reveals that
the cardiovascular standard was generally based on performance statistics from a population of U.S.
Air Force men and women in the early 1990s (Government Accounting Office, 1998a).

Following a review of the literature on the benefits of strength training (Palmer & Soest, 1997),
the U.S. Air Force is implementing a program that incorporates measures of strength/endurance
(the ability to sustain submaximal contraction) and encourages flexibility activities. The expanded
fitness program is being field tested in the years 2000 and 2001, with pass/fail standards to be
implemented by January 2002.

The  current  U.S.  Air  Force  fitness  standard  does  not  mandate  duty  time  for  physical  fitness. 
Individual  commanders  are  encouraged  to  require  on-duty  physical  training,  but  for  the  most  part, 
physical  training  is  the  responsibility  of  the  individual. 

In addition to aerobic fitness, the U.S. Air Force also maintains minimum standards of body
composition. A history of the U.S. Air Force body composition program was derived by authors of
the 1998a GAO report through discussions with officers who were responsible for the program. U.S.
Air Force officials were unable to provide specific studies or records to document their body fat
standards. The U.S. Air Force uses the same body fat estimation equations as the U.S. Navy, and
measurements include circumference of waist and neck for men and abdomen, neck, and hip for women.

Occupational Fitness — Occupational fitness within the U.S. Air Force is part of the U.S. Air Force
Personnel Center’s classification program, is not part of the U.S. Air Force Fitness Program,
and has focused on the Strength Aptitude Test (SAT). The SAT is a weight-lifting test performed at
the Military Entrance Processing Stations (MEPSs) by all U.S. Air Force recruits under U.S. Air
Force Policy AFMAN 36-2108 Attachment 39. The SAT was developed in five phases between
1977 and 1982 with the goal of matching abilities of recruits with the demands of individual jobs.
The SAT was changed from a test program to a fully implemented program in 1987 (McDaniel,
personal communication, 2000). The SAT uses an incremental lift machine that consists of a
vertically moving carriage with handgrips. The weight to be lifted can vary from 40 to 110 pounds, in
increments of 10 pounds. The handgrips are 16 inches apart and are 12 inches from the floor. The
carriage and weight are lifted overhead. Lifts are repeated with increased weight until the lift
cannot be safely completed at the 6-foot level or above the head.
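
The incremental protocol just described can be sketched as a simple loop. This is only an illustration under stated assumptions: the function name `incremental_lift_score` and the `can_lift` callback (standing in for a tester's judgment of whether a lift was safely completed) are hypothetical, and the scoring rule (heaviest weight successfully lifted) is an assumption rather than a documented SAT rule. The weight range and increment come from the text.

```python
from typing import Callable, Optional

MIN_WEIGHT_LB = 40    # starting weight (from the text)
MAX_WEIGHT_LB = 110   # heaviest weight available (from the text)
INCREMENT_LB = 10     # increase after each successful lift (from the text)


def incremental_lift_score(can_lift: Callable[[int], bool]) -> Optional[int]:
    """Return the heaviest weight lifted, or None if the first lift fails.

    `can_lift` is a hypothetical stand-in for the tester's judgment that a
    lift at the given weight was safely completed to the required height.
    """
    best = None
    for weight in range(MIN_WEIGHT_LB, MAX_WEIGHT_LB + 1, INCREMENT_LB):
        if not can_lift(weight):
            break  # stop at the first lift that cannot be completed
        best = weight
    return best


# Hypothetical recruit whose safe lifting capacity is 75 pounds.
print(incremental_lift_score(lambda w: w <= 75))  # 70
```

A recruit who completes every lift simply reaches the 110-pound ceiling, so the test cannot distinguish capacities above that weight.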

During development of the SAT, several strength tests were considered as candidates. An
incremental lift test to a height of six feet was finally selected because it proved to be the best single test
for U.S. Air Force-wide use. After the test was developed and implemented, the empirical
relationship between test performance and performance on specific job tasks was documented. The next
phase was to survey the physical demands of every career field in the U.S. Air Force. This ambitious
project measured the physically demanding components, such as lifting and pushing, of each job. The
U.S. Air Force adopted strength requirements for all enlisted U.S. Air Force Specialties (AFSs) in
1988. The make-up and the duties of an AFS do not remain constant. Since the original survey of
physical demands during 1978 to 1982, some AFSs have been subdivided and some combined to
create new ones. In some cases, the systems and equipment used by the members of an AFS have
changed. When changes in the job result in changes in the physical demands of the job, the SAT
criterion may become obsolete. If this occurs, the physical demands must be reanalyzed to determine if
the SAT criteria should be changed. Each year, it is the goal of the U.S. Air Force Research
Laboratory’s (AFRL) Human Engineering Directorate to resurvey several AFSs and compute new
SAT criteria for them.

History  of  U.S.  Navy  Fitness  Program 

Physical Fitness — The history of the U.S. Navy’s fitness program is outlined in a technical report
by Hodgdon (1999), who states that the first Naval Operations Instruction 6110.1 (Chief of Naval
Operations, 1976) emphasized cardiovascular fitness, based on the popular aerobics program of Dr.
Kenneth Cooper. The next version of this instruction, Naval Operations Instruction 6110.1A
(Chief of Naval Operations, 1980), was the first to include a physical fitness test to allow
assessment of each individual’s level of fitness. Issued in 1980, it followed President Jimmy Carter’s
request for such an assessment among all the Military Services.

Naval Operations Instruction 6110.1B (Chief of Naval Operations, 1982) was issued to
implement policies as directed by the Department of Defense in its Directive 1308.1 (U.S. Department
of Defense, 1981). This instruction included physical fitness, weight control, and health promotion
considerations, and outlined the Physical Readiness Test (PRT). The PRT consisted of a 1.5-mile
run, or the number of steps-in-place that could be done in 3 minutes, the number of curl-ups that
could be done in 2 minutes, and measurement of the sit-and-reach flexibility range. In addition, it
appointed Command Fitness Coordinators to manage the Commanding Officers for program
implementation. Consequences for failing the PRT were documented, and two new items were
added to the PRT: the number of push-ups that could be performed in 2 minutes and the time
required to swim 500 yards or 450 meters (an alternative to the 1.5-mile run). Ability to run in
place was dropped as a test item, and the sit-reach was made a pass-fail item.

The next instruction, Naval Operations Instruction 6110.1C (Chief of Naval Operations,
1986), was limited to physical fitness and body fat standards, with health promotion being covered
under a separate Instruction. Body fat content measurement was included with this version of the
fitness Instruction, and a new technique for estimating body fat content was detailed. Also, new
body fat standards were adopted. These were derived through analysis of several scientific
studies by the Naval Health Research Center. The 1985 National Institutes of Health (NIH)
definition of obesity has been used as an upper limit for males, with a conversion of the 1983
Metropolitan Life weight-for-height values into mean body fat percentages of 22 percent for males
and 33 percent for females (Metropolitan Life Insurance Company, 1983). These figures were
recommended as U.S. Navy maximums for body fat (National Institutes of Health, 1985). The
recommendation for men was accepted, but command concerns about physical appearance dropped
the female standard to 30 percent. The U.S. Navy’s equations were developed by Hodgdon and
Beckett (1984a, b) at the Naval Health Research Center. Circumferences used for males include
that of the abdomen and neck, and for females, neck, waist, and hip.

The 1990 Naval Instruction, 6110.1D (Chief of Naval Operations, 1990), made participation in
the  PRT  for  personnel  more  than  50  years  old  optional.  Body  fat  standards  were  modified  so  that 
the  limit  for  males  was  22  percent  and  the  limit  for  females  was  30  percent.  The  weight-for-height 
table  was  introduced  as  an  initial  screening  device  for  body  fat  evaluation. 

In 1998, the female standard was unexpectedly raised back to the originally recommended 33
percent. The next Instruction, 6110.1E (Chief of Naval Operations, 1998), was released in March
1998 and contained revised weight-for-height tables and a body fat limit of 33 percent for women.
Minor changes were made to the PRT testing protocol. Reports by the Government Accounting
Office (1998a, b) and the Institute of Medicine (1998) recommended changes to the Military
Services’ fitness plans, and these suggestions led to the most recent instruction, 6110.1F (Chief of
Naval Operations, 2000), the basis of the U.S. Navy’s current fitness program.

Current  U.S.  Navy  Fitness  Program 

Physical Fitness — The U.S. Navy’s new fitness program is governed by Naval Operations
Instruction 6110.1F (Chief of Naval Operations, 2000) and consists of a biannual Physical Fitness
Assessment (PFA). The PFA is a goal-oriented, total health, physical fitness and readiness program
that has three components: Physical Activity Risk Factor Screening, body composition
assessment, and the Physical Readiness Test (PRT) (Chief of Naval Operations, 2000). The PRT
consists of the sit-reach, push-ups, curl-ups, and run or swim. This new instruction changes the U.S.
Navy  fitness  test  from  a  score-based  to  a  goal-oriented  program,  with  separate  standards  for  each 
gender  within  age  groups.  All  personnel  are  tested,  with  no  upper  age  limit.  There  are  five 
Performance  Categories  (Outstanding,  Excellent,  Good,  Satisfactory,  and  Unsatisfactory)  and 
three  performance  levels  within  these  categories  (High,  Medium,  and  Low  for  Outstanding, 
Excellent,  and  Good  Categories;  High,  Medium,  and  Marginal  for  the  Satisfactory  Category). 
Standards  were  determined  on  the  basis  of  PRT  scores  gathered  during  1997  and  1998. 

The  new  Instruction  encourages  the  Morale,  Welfare,  and  Recreation  (MWR)  fitness  staff  to 
provide  one-on-one  exercise  prescriptions  to  individuals  needing  assistance  in  attaining  and  main¬ 
taining  their  fitness  level  as  part  of  the  U.S.  Navy’s  PFA. 

In accordance with DoD guidance, the U.S. Navy also maintains body composition standards
in addition to its physical fitness and weight requirements. Males between the ages of 17 and 39
can have a maximum of 22 percent body fat, while females in this age group can have a maximum
of 33 percent body fat. For personnel aged 40 and over, the limits are 23 percent (males) and 34
percent (females) (Chief of Naval Operations, 2000).
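
The limits above amount to a two-way lookup by sex and age group, which can be sketched as follows. The function name `navy_max_body_fat` and the treatment of ages below 17 as invalid input are assumptions made for illustration; only the four percentages and the two age groups come from the text.

```python
# Maximum body fat (percent) as (ages 17 to 39, ages 40 and over), from the text.
NAVY_BODY_FAT_LIMITS = {
    "male": (22, 23),
    "female": (33, 34),
}


def navy_max_body_fat(age: int, sex: str) -> int:
    """Look up the maximum allowable body fat percentage (hypothetical helper)."""
    if age < 17:
        # Assumption: the text gives no limit below age 17.
        raise ValueError("no limit is stated for ages under 17")
    if sex not in NAVY_BODY_FAT_LIMITS:
        raise ValueError("sex must be 'male' or 'female'")
    younger, older = NAVY_BODY_FAT_LIMITS[sex]
    return older if age >= 40 else younger


print(navy_max_body_fat(25, "male"))    # 22
print(navy_max_body_fat(45, "female"))  # 34
```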

Occupational Fitness — In the early 1950s, the U.S. Navy started a project whose goal was to
determine minimum physical specifications for U.S. Navy jobs (Browne & Germain, 1952). A literature
search was conducted, and a data-gathering methodology was established. Demands of several
career fields were documented. In addition, Robertson (1992) published the results of a U.S. Navy
project for developing job performance standards of muscularly demanding aircraft and shipboard
tasks. The project was completed in five phases. All muscularly demanding aircraft carrier and
shipboard tasks were identified, criterion tasks and performance standards were determined, an
instrument was developed to measure criterion task performance, a strength test battery was
constructed, and the strength test battery was validated regarding its ability to predict criterion task
performance. Robertson (1992) notes that, “Although these methods to set standards have been
demonstrated, they have not been implemented. Perhaps it is because (1) costs of injuries or
nonperformance have not been adequately demonstrated, or (2) competing concerns for selection
solely on technical abilities have predominated” (p. 1301).

The U.S. Navy considered using a strength test as a screen for applicants desiring entry into
physically demanding Military fields and concluded that more women than men would be
excluded as a result (Government Accounting Office, 1998). Since U.S. Navy women were meeting the
demands of their occupations, it was not deemed necessary to either implement such a test or
categorize career fields by physical demand. While the U.S. Navy has no specific fitness standards for
various occupations, it has done considerable research in this field. A study by Vickers, Hervig, and
White (1997) examined the relationship between Physical Demand Ratings (PDRs) and back
injury hospitalization rates (BIRs) for 73 entry-level U.S. Navy occupations. The study
demonstrated a strong relationship between PDRs and BIRs. Applying the resulting quadratic function
to define an exceptionally demanding job, 44 percent of the 73 occupations studied would require
occupation-specific fitness standards.

Job-relevant  training  programs  are  being  developed  by  the  Naval  Health  Research  Center  in 
response  to  the  DoD  directive,  1308.1  (U.S.  Department  of  Defense,  1981),  which  ordered  each 
Military  Service  to  develop  training  programs  to  meet  the  specific  task  requirements  of  their  per¬ 
sonnel.  One  such  total  body  fitness  program  is  SPARTEN  (Scientific  Program  of  Aerobic  and 
Resistance  Training  Exercise  in  the  U.S.  Navy).  SPARTEN  is  an  on-ship  program  based  on 
research  findings  that  indicate  that  aerobic  and  circuit  training  is  superior  to  aerobic  and  calisthenic 
conditioning for developing total body fitness. It offers aerobic training to maintain health and
progressive resistance training to optimize job performance and minimize job-related injuries
(Marcinik, 1984). The movements are intended to simulate efforts such as lifting, pushing, and
pulling, which are required in muscularly demanding shipboard work.
The  circuit  weight  training  exercises  are  performed  on  a  multistation  machine  and  develop  all 
major  muscle  groups  with  minimal  rest  periods  to  further  develop  cardiovascular  endurance. 

History of U.S. Marine Corps Fitness Program

Physical Fitness — U.S. Marine Corps male standards were based on 1967 studies that established
10th percentile and 90th percentile times for the 3-mile run. The 10th percentile was deemed the
cut-point for failure and the 90th percentile the upper limit for maximum points awarded. In 1997,
the Corps increased the run distance for females from 1.5 to 3 miles to match the requirement for
males. The female time standard was based on studies conducted in 1993 and 1996
(Government Accounting Office, 1996) establishing about a 3-minute increase in time to complete
the run between males and females.

The U.S. Marine Corps Order 6100.3J (U.S. Department of the Navy, 1988) states that all U.S.
Marines will participate in a minimum of 3 hours of physical fitness training per week. A
successful Physical Conditioning Program is said to consist of the following types of exercises: anaerobic
conditioning, progressive resistance training, and aerobic conditioning.

With respect to the development of male or female body fat standards, the U.S. Marine Corps
had no available documentation (GAO, 1998a). The GAO’s interview with U.S. Marine Corps
officials (GAO, 1998a) revealed that the standards were based on command judgments regarding
fitness and appearance, as opposed to actuarial tables or any other scientific basis. Some limited
research may have been applied, however, because regulation defined the maximum allowable body
fat percentage for males as 18 percent, which is just below the midpoint of the interval between the
10 percent figure said to be the average for marathon runners and the 30 percent figure that defines
gross obesity. The female standard of 26 percent is at about the 80 percent point of the interval
between the 11 percent body fat level which the regulation says is that of the average female
gymnast and the 30 percent level that defines gross obesity in women (GAO, 1998a).
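
The interval positions cited above can be checked with a short calculation. This is a worked illustration only; the helper name `interval_position` is hypothetical, and the reference values are those quoted in the text.

```python
def interval_position(value, low, high):
    """Fraction of the way from `low` to `high` at which `value` falls."""
    return (value - low) / (high - low)


# Males: 18 percent between 10 (marathon runners) and 30 (gross obesity).
male = interval_position(18, 10, 30)
# Females: 26 percent between 11 (gymnasts) and 30 (gross obesity).
female = interval_position(26, 11, 30)

print(f"{male:.0%}")    # 40%, just below the 50 percent midpoint
print(f"{female:.0%}")  # 79%, about the 80 percent point
```

On this scale the male limit sits at 40 percent of its interval and the female limit at roughly 79 percent, so the female standard is placed proportionally much closer to the obesity figure than the male standard.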

Current Program — The U.S. Marine Physical Fitness Program is governed by U.S. Marine Corps
Order 6100.3J (U.S. Department of the Navy, 1988). The U.S. Marine Corps Physical
Conditioning Program is designed to promote everyday work effectiveness, combat readiness,
leadership, and self-discipline. U.S. Marine Corps testing, which is administered twice a year, differs
slightly for men and women. Women and men run for 3 miles. Both sexes perform sit-ups. Women
do a flexed arm hang while men perform pull-ups/chin-ups. The run is used to measure the
efficiency of the cardiovascular system, and the other events are designed to test the strength and
stamina of the upper body (shoulder girdle), midsection, and lower body.

Body fat standards are 18 percent (males) and 26 percent (females). Those exceeding
height/weight standards are referred for body fat assessment. To determine body fat, the U.S. Marine
Corps now uses a circumference equation, first developed by Wright, Dotson, and Davis
(1980, 1981) and later modified by Hodgdon and Beckett (1984a, b), which takes into account neck
and abdomen measurements for men and neck, waist, and hips for women. It is based on a
four-component body composition criterion. All U.S. Marines under the age of 46 participate in this testing.

Occupational  Fitness  —  The  U.S.  Marine  Corps  at  one  time  administered  a  physical  readiness  test 
of  combat  skills,  but  this  has  been  discontinued. 

Evaluation  of  Military  Fitness  Standards 

The  previous  section  detailed  the  fitness  and  occupational  standards  of  the  U.S.  Military 
Services.  How  these  standards  were  determined  was  also  documented,  when  this  information  was 
available.  The  current  section  reiterates  the  processes  by  which  these  standards  were  generated  to 
underscore  the  limitations  of  these  processes. 

Most current Military standards were derived through the use of population norms, based on
the distribution of scores obtained on a performance test. For instance, in some cases the Services
have used data from Military populations, including males and females, on tests of sit-ups,
push-ups, and running to establish minimum and maximum standards at percentiles of that performance
curve. In some cases, an approach this systematic was not applied to standards for females.

U.S.  Army — Up  until  the  time  of  its  most  current  program  change,  the  U.S.  Army  established  min¬ 
imum  requirements  for  its  U.S.  Army  Fitness  Test  on  data  collected  in  the  1980s  (GAO,  1998b). 
These  data  formed  the  minimum  requirements  on  which  standards  were  based.  Incremental  steps 
to  the  maximums  were  based  on  simple  numerical  progressions  from  these  minimums,  not  actual 
scores.  Vogel  (1986)  reports  that  the  U.S.  Army’s  2-mile  timed  run  is  a  good  estimate  of  aerobic 
fitness,  but  push-ups  and  sit-ups  are  somewhat  inadequate  as  measures  of  general  strength.  The 
report also states that both of these events should be considered primarily muscle endurance
measures that are limited to shoulder and abdominal muscles. While neither of these tests correlates well
with common soldiering tasks, they serve to stimulate physical training activity.

Last year, the U.S. Army began to implement new standards that are based on a more
statistically valid sample base than in the past. The policy behind the new standards is a gender-neutral
“equal points for equal work,” with minimum requirements generally set on the 8th percentile of
actual scores gathered during a U.S. Army study in 1995. Maximums are based on 90th percentile
scores, and requirements are reduced in five-year age increments.
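
Setting a minimum and a maximum standard at fixed percentiles of an observed score distribution can be sketched as follows. This is a minimal illustration, not the U.S. Army's actual procedure: the function name `percentile_cutoffs`, the choice of the "inclusive" interpolation method, and the sample scores are all assumptions. Only the 8th and 90th percentile targets come from the text.

```python
import statistics


def percentile_cutoffs(scores, low_pct=8, high_pct=90):
    """Return (minimum standard, maximum standard) at the given percentiles.

    statistics.quantiles with n=100 returns the 99 cut points that separate
    the data into percentiles; the cut point at index p - 1 is the p-th
    percentile. Higher scores are assumed to be better (e.g., repetitions).
    """
    cuts = statistics.quantiles(scores, n=100, method="inclusive")
    return cuts[low_pct - 1], cuts[high_pct - 1]


# Hypothetical repetition counts for 101 test takers (not real data).
sample_scores = list(range(1, 102))
minimum, maximum = percentile_cutoffs(sample_scores)
print(minimum, maximum)
```

For timed events such as a run, lower values are better, so the same cut points would be read in the opposite direction: the slow tail of the distribution defines the minimum standard.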

U.S. Air Force — A 1998 GAO report (GAO, 1998b) concluded that U.S. Air Force officials had no
published studies or other records to document the rationale for their cardiovascular endurance
standards, but an informal account of the U.S. Air Force’s fitness history reveals that the cardiovascular
standard was based on limited normative statistics from a population of U.S. Air Force men and
women in the early 1990s. The population was divided into quintiles. The GAO report states that:

Researchers recommended that the minimum standard be set at the 20th percentile of performance
because that was the point with the largest incremental gain in health benefits between percentile
groups. However, U.S. Air Force officials wanted a higher standard for readiness reasons; as a
result, the next percentile grouping, the 40th percentile, was selected as the minimum standard.
Female standards were set the same way and at the same level.


The  U.S.  Air  Force  is  fielding  its  new  strength  test,  with  standards  that  will  likely  be  based  on  U.S. 
Army  standards  with  some  possible  refinements  derived  during  this  fielding  period. 

U.S. Navy — U.S. Navy standards for fitness test events (1.5-mile run/walk, push-ups, and sit-ups) for
men 30 and older are based on distributions of actual scores among the extant population gathered
during the past two years. Earlier minimum requirements (GAO, 1998a) were set at the 10th percentile and
maximums at the 90th to 95th percentiles. However, for the run time for women, an arbitrary increment
of time was added to the men’s standard rather than being based on actual run times of women.

U.S. Marine Corps — U.S. Marine Corps standards were probably based on 1967 studies showing
average 3-mile run times, with maximum times set at the 90th percentile and minimums at the 10th
percentile. Studies conducted in 1993 and 1996 revealed about a 3-minute difference in run times
between men and women, so this 3-minute difference was added to the men’s standard scores to
form the standards for females (GAO, 1998a).


Evaluation  of  Military  Body  Composition  Standards 

The body composition standards set by the Department of Defense were first documented in
1981 (GAO, 1998a). A study panel recommended that female and male standards be
based on findings in scientific texts that average body fat for physically fit males was 20 percent and
average body fat for physically fit females was 30 percent. However, the actual guidance indicated
a 26 percent figure for females, following the belief that, “it was desirable to recruit women whose
body fat was closer to that of the average man, as such women, possessing a higher than average
proportion of fat free mass, might also be more similar to men in strength and endurance.” DoD
standards were modified in 1995 (U.S. Department of Defense, 1995) to between 18 and 26
percent for men and 26 to 36 percent for women. GAO authors indicate that this change was made
to accommodate the range of values in effect at the time from all the Services, but that no
scientific research was conducted on which to base such a change.

U.S. Army — Friedl's 1992 chapter in Body Composition and Physical Performance outlines the history
and rationale behind the U.S. Army's body composition standards. The U.S. Army's current 20
percent figure for males is based on actual scores of young U.S. Army males recorded during the
1980s. The 26 percent figure was attained by increasing the 20 percent figure by two points for
every 10 years of increasing age. The U.S. Army's standards for females were determined by adding
8 percentage points to the male standard for each category. In 1991, the female standards were
made less stringent (from 28 to 34 percent to 30 to 36 percent).
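The arithmetic behind these figures can be written out as a short Python sketch. It rests on assumptions that are not stated in the text: four age categories are inferred from the 20 to 26 percent range in two-point steps, and the 1991 change is modeled as a uniform two-point shift because only the endpoints of the female range are given. The function and parameter names are hypothetical.

```python
def army_body_fat_limits(base_male_pct=20, step_pct=2, n_age_groups=4,
                         female_offset_pct=8, female_adjustment_pct=0):
    """Build per-age-group body fat limits from a base male value.

    n_age_groups and female_adjustment_pct are illustrative assumptions,
    not values taken from Army regulations.
    """
    limits = []
    for group in range(n_age_groups):
        # Two points are added for each successive (ten-year) age category.
        male = base_male_pct + step_pct * group
        # The female limit is the male limit plus a fixed offset.
        female = male + female_offset_pct + female_adjustment_pct
        limits.append({"age_group": group, "male": male, "female": female})
    return limits


original = army_body_fat_limits()
relaxed = army_body_fat_limits(female_adjustment_pct=2)

print([g["male"] for g in original])
print([g["female"] for g in original])
print([g["female"] for g in relaxed])
```

Under these assumptions the male limits run from 20 to 26 percent, the original female limits from 28 to 34 percent, and the relaxed female limits from 30 to 36 percent, matching the ranges quoted in the paragraph.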

U.S. Air Force — The 1998 GAO report (General Accounting Office, 1998a, b) indicates that
U.S. Air Force officials did not have data to support their standards derivation process. The U.S.
Air Force is currently considering a two-tier approach to body composition standards. The first tier
would deal with health and readiness, and the second tier would represent job-specific standards
(Wilkinson, Kampert, Blair, Baumgartner & Constable, 2000).

U.S. Navy — The U.S. Navy body composition standards are based on the National Institutes of
Health  definition  of  obesity.  U.S.  Navy  scientists  converted  the  weight-for-height  table  data  into 
mean  body  fat  percentages  of  about  22  percent  for  males  and  33  percent  for  females. 

U.S. Marine Corps — The U.S. Marine Corps body fat standards appear to be based on command
judgments for fitness and appearance, according to the GAO (GAO, 1998a). Some limited research
may have been applied. The maximum allowable body fat for male U.S. Marines is only 18 percent,
and the female standard is 26 percent.


Evaluation of Occupational Fitness Standards


The  current  use  of  an  occupational  standards  approach  by  the  U.S.  Services  is  meager.  Presently, 
the  U.S.  Air  Force  AFSCs  are  categorized  into  eight  physical  demand  categories,  and  the  U.S.  Air 
Force  uses  a  strength  aptitude  test  to  screen  out  those  recruits  who  would  not  be  likely  to  perform 
successfully on a given job (U.S. Department of the Air Force, 1994). The U.S. Air Force does not
incorporate  this  fitness  test  into  the  required  annual  fitness  evaluation.  The  U.S.  Navy  has  not 
adopted  occupational  strength  standards  for  active  duty  personnel  or  recruits,  nor  has  the  U.S. 
Army.  The  U.S.  Marine  Corps  at  one  time  administered  a  physical  readiness  test  of  combat  skills, 
but  this  has  been  discontinued. 

The U.S. Air Force administers a strength test at the time of enlistment, which is used to determine
job qualification. The test is an incremental lift test, called the SAT, described earlier in this
chapter. At one time, the U.S. Army used a similar test, but has discontinued its use. Neither the U.S.
Navy nor the U.S. Marine Corps performs an occupationally focused strength assessment at any time.


Summary 


Interest in physical fitness and exercise is found in writings as early as 200 B.C. Citizens of ancient
Greece valued those who displayed exceptional athletic performance as possessing both spiritual
and physical strength rivaling the gods (USHHS, 1996). Physical fitness and performance,
or the lack of them, have long been an issue for our Military. The primary intent of physical standards
in the Military has always been the selection of Service members best suited to the inherent physical
job demands (Friedl, 1992). Likewise, industry showed an early interest in performance on the
job and particularly the measurement of work (Hogan, 1991). For example, Taylor attempted to
quantify a day's worth of work for a “first class man” in the late 1800s (Taylor, 1923). Much of this
chapter focused on the approaches to fitness in the U.S. Military because of the easily available
documentation and the stated strong interest by the Services. Furthermore, DoD guidance stipulates
physical fitness programs for each of the Services. Past programs have been reviewed here in some
detail. We surmise that these programs have tended to focus on “convenient” test batteries (running,
calisthenics, etc.), and perhaps have had a historical bias toward physical appearance.
Unfortunately, the testing components rarely seem to reflect our best understanding of the science
(versus the historical precedents). For example, Robertson (1992) noted that although
better scientific methodologies may have been demonstrated, these improved practices have eluded
programmatic implementation. Not surprisingly, a recent report by the GAO advises that the
DoD should establish a mechanism for providing policy and research coordination of the Military
Services' physical fitness and body fat programs (DoD Joint Technology Coordinating Group-5;
U.S. Army Research and Materiel Command, 1999). On the other hand, broader implementation
of fitness/performance testing and standards in other governmental or commercial sectors has
likely been severely constrained by potential litigation concerning perceived worker discrimination
(see Chapter 7). Other chapters in this SOAR will describe in detail the specific findings, issues,
and practices associated with the development of physical fitness/performance standards to date.


Health-Related Fitness Standards:
A Baseline Approach

Stefan  H.  Constable,  Ph.D. 

USAF  School  of  Aerospace  Medicine 
Brooks  AFB,  TX 


Abstract 


This chapter attempts to build a rationale for an ancillary or alternative approach to fitness standards
development: health-based fitness levels. The difficulty with instituting job-specific performance
tests in the Military or elsewhere is identified throughout this State-of-the-Art Report
(SOAR). Moreover, performance on current Military fitness tests does not correlate well with
performance on task-specific job tests required for many Military vocations. The scientific literature
now strongly supports the association between physical activity/fitness and general
health, wellness, and quality of life. Therefore, we postulate that health-based standards can serve as
an adjunctive approach to the physical fitness standards process. There may be greater specific
application opportunities for the Military. The focus of this chapter is to —

1. Document the specific relationship between physical activity or fitness and specific health outcomes,

2. Review exercise prescriptions and investigate the quantitative relationships between physical
activity benefits and measured levels of fitness,

3. Identify those attempts to produce a specific cut-point for the specified (identified) fitness
components, and

4. Assess, at least qualitatively, the validity of those health-based fitness approaches.

Identifying the minimal dose-response (versus adequate or optimal) relationships, not to mention
truly testable metrics or standards, has proved the greatest challenge. However attractive or meritorious
this endeavor may initially seem, varying levels of difficulty may be encountered in practice,
depending on the chosen fitness modality, that is, aerobic, strength, or body composition.
This difficulty is primarily due to the lack of sufficient
data and/or discrete methodologies to identify clearly defined cut-points on which to base standards.
Nevertheless, this should not deter further efforts to investigate or apply alternative procedures.


Health  and  Fitness  Across  the  Workplace 


Fitness standards have usually been developed around some minimum level of individual
physical capacity that relates well to occupational job performance, for purposes of candidate selection,
worker retention, or advancement. Although there are currently many workplace scenarios in which
the physical requirements of the job are quite low, there is still significant interest in achieving
or maintaining a reasonable fitness level for all workers. Fitness is essential, especially in the
Military, for a variety of reasons, including better health, everyday work efficiency, improved
cognitive functioning, good appearance, and increased readiness. This, of course, implies a
corresponding measure or metric of some level of physical fitness, with a resultant positive impact on
occupational performance. Thus, the issue of a general or baseline fitness requirement in the workplace
population or the Military is twofold —

1.  A  basic  level  of  fitness  for  overall  health,  and 

2.  Increased  levels  of  fitness  for  optimum  performance  for  occupational  and  recreational  activities. 

Currently,  the  Department  of  Defense  considers  physical  fitness  an  important  component  of 
the  “general  health  and  well-being”  and  readiness  of  all  Military  members  (Institute  of  Medicine, 
1998).  Therefore,  this  chapter  explores  the  concept  of  baseline,  health-related  fitness  requirements 
with potential application to selected Military and/or civilian environments. This chapter also presents
the theoretical merit for this more generic approach to the process of physical fitness standards
development,  and  identifies  the  methodological  procedures  and  precedents  for  further  application. 


General  Health  and  Fitness  Descriptors 


First, it is necessary to identify a framework or general definition associated with health.
Although the specific nomenclature used may be diverse, individual health status clearly exists on a
continuum, as does physical fitness (Institute of Medicine, 1998). The World Health Organization
has defined health as a positive state of physical, mental, and social well-being (U.S. Department of
Health and Human Services, 1996) as opposed to just the absence of disease or disability. In addition,
health-related quality of life (HRQL) has received considerable attention on clinical, scientific,
and public-interest fronts (U.S. Department of Health and Human Services, 1996).

The preceding terminology relates well to the detailed characterizations of health-related fitness
expressed in Bouchard and Shephard's chapter in the Second International Consensus Symposium
on Physical Activity, Fitness, and Health (1993) —

Fitness is operationalized in present-day Western societies with a focus on two goals: performance
and health. Performance-related fitness refers to those components of fitness that are necessary for
optimal work or sport performance. Health-related fitness refers to those components of fitness that
are affected favorably or unfavorably by habitual physical activity and relate to health status. It has
been defined as a state characterized by (a) an ability to perform daily activities with vigor, and
(b) demonstration of traits and capacities that are associated with a low risk of premature development
of hypokinetic diseases and conditions. Important components of health-related fitness
include body mass for height, body composition, subcutaneous fat distribution, abdominal visceral
fat, bone density, strength and endurance of the abdominal and dorso-lumbar musculature, heart
and lung function, blood pressure, maximal aerobic power and capacity, glucose and insulin metabolism,
blood lipid and lipoprotein profile, and the ratio of lipid to carbohydrate oxidized in a variety
of situations. A favorable profile for these various factors presents a clear advantage in terms of
health outcomes as assessed by morbidity and mortality statistics. (p. 15)

For  this  chapter,  three  main  categories  of  health-related  fitness  are  identified  as  follows — 

1.  Cardiorespiratory  fitness  is  the  ability  of  the  respiratory  and  circulatory  systems  to  adapt 
to  and  recover  from  vigorous  activities  that  involve  large  muscle  groups,  thus  increasing 
the  heart  rate  and  blood  circulation.  Examples  of  these  activities  are  walking,  jogging, 
swimming,  and  biking. 

2.  Body  composition  is  the  ratio  of  body  fat  to  lean  body  tissue  (muscle  and  bones,  etc.) 
expressed  as  a  percentage. 

3.  Musculoskeletal  fitness  may  be  subdivided  into  muscular  strength,  muscular  endurance, 
and  whole-body  flexibility. 

Strength is measured by determining the one-repetition maximal force that can be exerted against
a resistance, for example, lifting a weighted bar. Muscular endurance is the ability to repeatedly apply
force to resistance; it could be measured, for example, by the number of repetitions of a particular
exercise, such as sit-ups, that one can perform. Flexibility is the ability to operate through a “full” range
of motion, such as touching one's toes while in a seated position with the legs straight (Nieman, 1998).

Often these definitions tend to distinguish physical skills such as speed, agility, power, and
coordination from physical fitness. Although physical skill is useful for sports and certain job/task
performance (and may improve with increased physical activity), it is important to note that athletic
skill is generally not necessary for health improvement or disease prevention.

Table 2.1 (derived from Pate, 1988) demonstrates the overlap among more specific components
suggested to fall under the generic fitness label as they might relate to motor performance, physical
fitness, and health-related fitness.

Table 2.1 Components of motor performance, physical fitness, and health-related physical fitness
Journal of the American Medical Association, 1995, 273, 40.

Columns: Component; Motor Performance; Physical Fitness; Health-related Physical Fitness
Components: Anaerobic Power; Speed; Muscular Strength; Muscular Endurance; Cardiorespiratory
Endurance; Flexibility; Body Composition; Agility

Metabolic Fitness

More recently, an analogous category of health-related fitness, metabolic fitness, was introduced
by Despres et al. (1990) and conceptualized in the American College of Sports Medicine
Position Stand (Pollock et al., 1998). This concept “describes the state of metabolic systems and
variables predictive of the risk of diabetes and CV disease which can be favorably altered by
increased physical activity or regular endurance exercise without the requirement of a training-related
increase in VO2max” (p. 976). Gaesser (1996) approaches a desirable level of metabolic fitness
in yet another way. He prescribes a healthy diet (less than or equal to 20 percent fat intake)
and, at the least, moderate and frequent exercise. Good metabolic fitness will then be evidenced by
normal outcomes for blood pressure, blood sugar, and lipid profiles, regardless of the resultant body
weight. The key here is a low-fat diet, basically without caloric restriction, with the general
focus on a healthy lifestyle without attention to body weight.

Although this metabolic approach to general fitness may have significant scientific merit, it
seems most problematic if one is interested in validly monitoring compliance: abnormal medical
measures or outcomes (that is, blood pressure, blood sugar, lipid profiles, and intra-abdominal fat) may
not be solely due to exercise and diet habits. In other words, how would one evaluate the true rigor
with which each person followed the exercise and diet prescription other than by self-report? Such
evaluation certainly would not be feasible in a highly accountable Military setting. Moreover, these
concerns are especially valid for the younger population, in which the metabolic consequences of
noncompliance can take years to manifest as morbidities or premorbidities.


The  Relationships  Between  Physical  Activity,  Fitness,  and  Health 


Although improving one's physical performance is desired by many people, maintaining or
improving one's personal health is by far the number one priority. This is especially true in today's
increasingly health-conscious environment, in which the physical requirements of our jobs are
consistently diminishing. Improvements in fitness levels generally correlate very well with both
increased physical performance and health. Clearly, the greatest health benefits are derived when
one improves from low or modest levels of initial fitness (Blair, 1995). On the other hand, elite
athletes must expend significant amounts of time and effort to achieve relatively small improvements
in performance (or fitness). Ironically, this latter goal may even be at the expense of
good health because of increased injuries, risk of infection, and other physiological maladies. Still,
physical activity, fitness, and health tend to be strongly and positively
related: more activity is generally better. It is noteworthy that these parameters also tend to reside
on a continuum of effects with a distinct degree of interindividual variability.

Although it has been argued that the degree of improvement in health status is often closely
dependent on the magnitude of the improvement in fitness, it is becoming more apparent that this
relationship is not so simple (Haskell, 1994). Indeed, Bouchard and Shephard (1993) state, “...the
relationships between the levels of physical activity, health-related fitness, and health are complex” (p. 11).
These scholars generally define health-related fitness as one's ability to perform daily activities with
vigor (Institute of Medicine, 1998) and have attempted, at least notionally, to describe these associations
in a model of health-related fitness as shown in Figure 2.1. This model clearly supports our general
concept of an ancillary approach to fitness standards development: health-based fitness levels.

Figure 2.1 Relationship among habitual physical activity, health-related fitness, and health status.
Reprinted, by permission, from C. Bouchard, R. J. Shephard, & T. Stephens, 1993. Physical Activity, Fitness,
and Health: Consensus Statement. Champaign, IL: Human Kinetics, 12.

Activity and fitness levels are therefore associated in a reciprocal manner: If a person engages in
more activity, fitness level improves. Conversely, if a person incorporates little or no activity in his
or her life, fitness level diminishes. In addition, a higher level of fitness increases one's ability to
engage in a broader variety of activities and at greater intensities, which in turn provides a greater level
of fitness, and therefore more opportunity to become even fitter. In this case, the self-fulfilling
prophecy is a positive one. At the initial onset of activity, there may be minimal physical ability. By
repeating the exercises one can adapt, improving physical health, enabling more activities to be
undertaken, and ultimately improving readiness. The close association and tightly interlaced nature
of exercise and health are vividly apparent throughout the research regarding these two elements.

One of the major recognized avenues for maintaining or improving health is through physical
activity. The Surgeon General's report (U.S. Department of Health and Human Services, 1996) has
defined physical activity as “bodily movement produced by the contraction of skeletal muscle that
increases energy expenditure above the basal level” (p. 20). The American College of Sports
Medicine Position Stand (1990) further indicates that the adaptive responses to physical training (or
exercise) are complex and normally include peripheral, central, structural, and functional factors
(Pollock et al., 1998). Individual physiologic adaptations have generally been well characterized in
the literature. However, the data are insufficient regarding the specific intensity, frequency,
and duration of training needed to predict confidently the precise magnitude of the benefit
outcomes (Pollock et al., 1998). In other words, it is very difficult to specifically characterize
each of the individual dose-response relationships.

In the Physical Activity, Fitness, and Health International Proceedings and Consensus
Statement (Bouchard, Shephard, & Stephens, 1994), Haskell offers a more detailed description of
the physiological effects of exercise (defined as any sustained or repeated movement of a relatively
large skeletal muscle mass) that occur through the body's immediate response to the exercise, the
adaptations that occur over time as the body increases its capacity or efficiency, or some combination
of these acute and chronic responses. Haskell further explains —

During and following this muscle contraction, local biochemical factors, along with activation of
the central nervous system, stimulate the increased activity of various hormones and enzymes that
help regulate key metabolic functions; and there are major shifts in cardiorespiratory performance.
If the activity is of sufficient intensity and duration, the renal, hepatic, gastrointestinal, and
immune systems become involved. Also, physical activity exerts physical forces on the bones, muscles,
and connective tissue as a result of muscle contraction or in response to gravity. (p. 1030)


Specific  Impact  of  Activity  and  Fitness  on  Personal  Health 


“Throughout history, numerous health professionals have observed that sedentary people
appear to suffer from more maladies than active people.” This early example may be found in the
writings of English physician Thomas Cogan, author of The Haven of Health (1584); he recommended
his book to students who, because of their sedentary ways, were believed to be most
susceptible to sickness (U.S. Department of Health and Human Services, 1996).

The influence of physical activity on disease prevention and mitigation, longevity, and overall
health has been under continual scrutiny for the past few decades. Several intensive reviews of these
topics have been mentioned earlier, including Physical Activity and Health: A Report of the Surgeon
General (U.S. Department of Health and Human Services, 1996), and Physical Activity, Fitness and
Health Consensus Statement (Bouchard, Shephard, & Stephens, 1993). Other studies by leading experts
have scientifically explored the beneficial effects of physical activity on several major conditions and
diseases. For example, these afflictions include obesity (DePietro, 1995; Stefanick, 1993), hypertension
(Hagberg, Montain, Martin, & Ehsani, 1989), cardiovascular disease (CVD) (Lee, Blair, &
Jackson, 1999; Stofan, Dipietro, Davis, Kohl, & Blair, 1998; Farrell et al., 1998), and diabetes (Hu et
al., 1999). Other studies have investigated the possible benefits of physical exercise on osteoporosis
(Recker et al., 1992), cancer (Sesso, Paffenbarger, Ha, & Lee, 1998), clinical depression, stroke
(Kiely, Wolf, Cupples, Beiser, & Kannel, 1994), and musculoskeletal health (e.g., back pain, limb
function). Refer to Appendix A for more detail.

Moreover, physical activity has been shown to have salutary effects on more than one condition
at a time. For example, frequent exercise has been shown to alleviate obesity (DePietro, 1995), and
obesity has been associated with other conditions, including diabetes (Helmrich, Ragland, Leung,
& Paffenbarger, 1991), hypertension (Wier, 1992), CVD (Lee, Blair, & Jackson, 1999), and stroke
(Gorelick et al., 1999). In addition to alleviating obesity as a risk factor for other conditions, physical
activity has been studied in direct relation to the prevention of diseases and reduction in overall
mortality rates (Lee, Blair, & Jackson, 1999; Blair et al., 1989).

Interrelated  Impact  on  Conditions 

The broad implications of engaging in physical activity create a positive domino effect on disease
prevention. In other words, moderate levels of physical activity can have very positive benefits
for many different illnesses and conditions at the same time. For example, if a person can control
obesity by exercising regularly, other disease risks are reduced for such illnesses as diabetes, heart
disease, and even clinical depression. By eliminating (or reducing the severity of) obesity through
physical activity, the risk of hypertension diminishes. This is also true of stroke, whose risk can be reduced
twofold because hypertension and obesity, two of the biggest risk factors for stroke onset, can be
controlled simultaneously. In addition, the risk of diabetes, coronary heart disease, cancer, and possibly
even depression is reduced. As each condition is prevented, a person's overall health, longevity, and
quality of life are improved. The overwhelming benefit is that all of these risks may be reduced by the
same intervention: moderate physical activity that is part of one's everyday habits. Moreover, the
feeling of accomplishment and improved well-being further promotes greater exercise compliance.

This proposition is supported by research findings. Lee, Blair, and Jackson (1999) examined the
health benefits of leanness in relation to cardiorespiratory fitness and all-cause mortality. They
collected data on 21,925 men, aged 30 to 83. Over the course of eight follow-up years there were 428
deaths. The researchers adjusted for age, examination year, cigarette smoking, alcohol intake, and
parental history of ischemic heart disease. They found that unfit, lean men had twice the risk of
all-cause mortality of fit, lean men, as measured by maximal exercise testing and body composition
assessment. They also discovered that unfit, lean men had a higher risk of all-cause and
CVD mortality than the men who were fit and overweight. Similarly, when waist girth was used as
a measure of fatness, the men who were physically unfit had a higher risk of all-cause mortality than
those who were fit, even if the unfit men were thinner. Those with a greater fitness level, regardless
of fat mass, tended to be at lesser risk. The authors concluded that the health benefits of leanness
are limited to fit men, and being fit may reduce the hazards of overweight and obesity.
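The cohort figures quoted above lend themselves to a small worked calculation, sketched here in Python. Only the totals (21,925 men, 428 deaths) come from the text; the group counts used for the risk ratio are invented to show what "twice the risk" means, and the function names are hypothetical. The study's own estimates were adjusted for age, smoking, and other covariates, which this crude arithmetic does not attempt.

```python
def crude_mortality(deaths, cohort_size):
    """Crude cumulative mortality: deaths divided by the number of people followed."""
    return deaths / cohort_size


def crude_risk_ratio(events_a, total_a, events_b, total_b):
    """Unadjusted risk ratio of group A relative to group B."""
    # Cross-multiplying gives the same value as (events_a/total_a) / (events_b/total_b).
    return (events_a * total_b) / (events_b * total_a)


# Totals reported in the text for the Lee, Blair, and Jackson (1999) cohort.
overall = crude_mortality(deaths=428, cohort_size=21925)
print(f"Crude mortality: {overall * 1000:.1f} deaths per 1,000 men")

# Hypothetical group counts, chosen only to illustrate a twofold difference.
ratio = crude_risk_ratio(events_a=40, total_a=1000, events_b=20, total_b=1000)
print(f"Illustrative unfit-versus-fit risk ratio: {ratio:.1f}")
```

A ratio of 2.0 in this sketch corresponds to the statement that one group had twice the risk of the other; the published comparison additionally controls for the listed confounders, so it should not be read as a simple division of raw counts.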

A further example of the interrelatedness of conditions is found in the connection between
depression and coronary heart disease. According to the National Institute of Mental Health
(NIMH) (1999), “Research over the past two decades has shown that depression and heart disease
are common companions and what is worse, each can lead to the other.” The findings regarding
depression preceding heart disease show a more rapid heartbeat, higher blood pressure, and faster
blood clotting in those who are depressed. Elevated insulin and high cholesterol may also be side
effects of depression and are contributors to heart disease. Other NIMH studies found that
depressed, heart disease-free patients were four times more likely to have a heart attack in the
subsequent 14 years. In addition, these studies indicated that heart patients who were depressed were
four times more likely to die in the subsequent six months than those who were not depressed.
Although one in six people will have an episode of major depression in his or her lifetime, the number
increases to one in two for those who have heart disease. Since depression is the leading cause
of disability worldwide and heart disease is the leading cause of death in the United States, the
combination of these conditions is synergistic and results in a major impact on individual health
and our society's economic growth.

Furthermore, exercise is specified as an effective intervention for both heart disease and
depression and also for other conditions (e.g., hypertension, insulin sensitivity) associated with those two
diseases.  Hypertension  cannot  be  ignored  when  discussing  the  relationship  between  diseases.  It  is  a 
major  risk  factor  for  several  conditions  (e.g.,  stroke,  cardiovascular  disease,  and  diabetes).  The  effects 
of  physical  activity  on  reducing  blood  pressure  are  clearly  documented  and  directly  related.  The  cur¬ 
rent  recommendation  to  engage  in  moderate  physical  activity  most  days  of  the  week  has  been  shown 
to have a salutary effect on reducing hypertension. The key factor for reducing blood pressure is
continual, frequent activity throughout life. It is possible that if a few key risk factors (e.g., obesity,
hypertension, and depression) for life-threatening illnesses (e.g., heart disease or stroke)
are reduced, then the rate of multiple disease conditions will be reduced as well. The consistency of
the research findings lies in the high positive correlation between frequent activity at any level and
salutary effects on so many of the conditions described in this chapter and Appendix A. With all of
the evidence relating the aforementioned diseases to an inactive, sedentary lifestyle, there is ample
reason to heed the government’s standards of regular, moderate activity.


Reduced/Increased  Risk  of  Injury 


Workers  who  are  fit  are  less  likely  to  have  on-the-job  injuries.  An  extensive  review  of  the  role 
of  physical  training  in  preventing  occupational  injuries  can  be  found  in  a  1992  Ergonomics  article 
entitled, “Physical Training: A Tool for Increasing Work Tolerance Limits of Employees Engaged
in  Manual  Handling  Tasks”  (Genaidy  et  al.,  1992).  The  article  emphasizes  that  a  lack  of  physical 
fitness  is  a  contributing  factor  to  musculoskeletal  injuries  resulting  from  manual  material  handling 
in particular (Palmer & Soest, 1997).

A recent supplement to the American Journal of Preventive Medicine, entitled Injuries in the U.S.
Armed Forces: Surveillance, Research, and Prevention (Jones & Amoroso, 2000), has focused on the
specific issue of injuries in the military. Altarac et al. (2000) point out that among the number of risk
factors for injury previously identified, low physical fitness and lower amounts of physical activity are
perhaps in the top five. Some risk factors may tend to be additive in their effects as well. Not
surprisingly, prior smoking tended to predict lower levels of physical performance in U.S. Army recruits.

Obviously, exercise produces extensive benefits, but there are also possible risks associated with
regular  physical  activity.  Although  the  most  severe  of  these  risks  may  be  the  risk  of  heart  attack,  the 
most  prevalent  and  critical  risk  of  an  exercise  program  would  be  to  not  continue  it.  A  sedentary 
lifestyle  with  little  or  no  sustained  physical  activity  is  the  precursor  to  being  physically  unfit,  which 
in  turn  has  been  demonstrated  to  be  associated  with  increased  risk  of  illness  and  decreased  quality 
of  life.  Therefore,  to  become  fit,  a  lifestyle  change  must  occur.  When  implementing  changes  in 
activity  level,  precautions  must  be  taken  to  avoid  musculoskeletal  injuries  that  would  obviously 
deter  people  from  further  activity.  To  protect  oneself  against  the  risks  associated  with  beginning  an 
exercise  program,  a  physically  unfit  person  should  check  with  his  or  her  doctor  for  any  existing  con¬ 
ditions  that  would  require  special  attention  (e.g.,  extremely  high  blood  pressure,  existing  heart  dis¬ 
ease,  or  musculoskeletal  problems).  The  precautions  set  by  the  physician  should  be  followed,  the 
activity  program  should  be  pursued  in  a  gradual  and  rational  manner,  and  the  activity  itself  should 
be  modified  to  each  person’s  beginning  fitness  level. 

To become reasonably fit, one must frequently engage in low- to moderately-intense physical
activity.  As  mentioned  previously,  understanding  that  highly  intense,  vigorous,  and  prolonged  exer¬ 
cise  is  not  required  to  improve  health  is  essential  in  moving  from  no  activity  to  some  activity.  Such 
vigorous  exercise  is  intimidating  and  risky  for  a  physically  unfit  person.  When  first  starting  a  pro¬ 
gram,  performing  light  exercise  reduces  the  risks  of  developing  strained,  pulled,  or  overly  sore  mus¬ 
cles.  When  people  exercise  in  excess  of  what  their  bodies  can  handle,  they  can  incur  injuries. 
Researchers  have  found  a  greater  risk  of  injury  associated  with  those  having  both  higher-  and 
lower-than  average  body  mass  index  (BMI)  (Jones  et  al.,  1992).  Since  many  sedentary  people  fall 
outside  the  average  BMI  category,  it  is  important  to  note  the  possibility  of  injury  and  the  impor¬ 
tance  of  proper  exercise  techniques  (e.g.,  warming  up,  stretching,  and  cooling  down).  In  addition, 
knowing the beneficial effects of light exercise or even increasing everyday activities (such as
walking up and down stairs) is encouraging for those who are not ready or willing to engage in a
program of vigorous activity. The risk of injury or even discomfort, which may preclude regular
exercise habits, is diminished by the current “increased lifestyle activity” recommendations. The
implication of this approach is that more people will move from the “physically unfit” to the
“moderately fit” category, thus significantly reducing mortality and morbidity rates.

In a study of 100 young athletes (average age 18) who died suddenly while exercising, 90 had heart
or  blood  vessel  birth  defects.  Congenital  CVD  is  now  the  major  cause  of  athletic  death  in  high  school 
and  college  (Nieman,  1998).  Other  reasons  for  sudden  heart  attack  while  exercising  are  known  or 
unknown  heart  disease,  usually  related  to  a  family  history  of  the  disease.  A  study  by  researchers  at 
Harvard  found  heart-attack  risk  to  be  5.9  times  higher  after  heavy  physical  exertion  versus  lighter  or 
no  exertion  (Nieman,  1998).  This  risk  was  especially  acute  for  those  who  were  habitually  inactive, 
while  those  who  were  accustomed  to  exercise  had  much  less  risk  of  heart  attack  when  engaging  in 
heavy  activity.  An  additional  study  of  36  marathon  runners  who  had  heart  attacks  found  that  most  of 
the runners had a strong family history of heart disease, suffered early warning symptoms of the
disease itself, but did not take precautions. Most researchers have found that fewer than 10 out of
100,000 men will have a heart attack during exercise. Most likely these were sedentary men at high
risk for heart disease who then engaged in physical activity. For this reason, it
is important for people to be screened and cleared by their physicians before starting an activity
program. The ACSM (1994) states that “The incidence of cardiovascular problems during physical
activity is reduced by nearly 50 percent when individuals are first screened, and those who are identified
with risk factors are diverted to other professionally established activity programs” (pp. i-v).

Although  there  are  risks  associated  with  engaging  in  activity,  the  greatest  risk  is  engaging  in  no 
activity.  People  at  all  fitness  levels  can  safely  implement  the  current  activity  recommendations  for 
good health, tailoring a program to specific needs if necessary. Clearly, if people take the
aforementioned precautions and consistently engage in physical activity throughout life, the death rate
from preventable diseases would drop significantly.


General  Sense  of  Well-Being 

In  addition  to  the  mental  health  benefit  of  less  depression  alluded  to  earlier,  regular  exercise  can 
also  reduce  levels  of  anxiety,  tension,  and  reaction  to  life’s  stresses.  Unfortunately,  a  person’s 
improved  well-being  is  difficult  to  assess  and  quantify.  In  addition,  an  individualized  program  that 
will  assuredly  affect  psychological  well-being  is  difficult  to  prescribe.  Still  it  is  clear  that  physical 
activity  can  positively  affect  many  areas  of  mental  health  (Bouchard  et  al.,  1993). 

Kaplan and Bush (U.S. Department of Health and Human Services, 1996) perhaps took this
further in a construct representing a person’s overall satisfaction with health-related quality of life
(HRQL).  The  intent  was  to  capture  the  influence  that  health  status  and  care  have  on  the  quality  of 
life.  Specific  effects  of  exercise  on  HRQL  include  psychological  well-being,  perceived  physical  func¬ 
tion  and  well-being,  and  perhaps  cognitive  function.  A  recent  review  (McAuley,  1994)  suggested 
that  positive  associations  between  self-esteem  and  physical  activity  exist  for  both  young  adults  and 
children.  This  was  also  true  whether  the  activity  was  chronic  (long-term  training)  or  acute  (a  single 
bout  of  activity).  These  findings  are  basically  independent  of  age,  and  with  older  adults  significant 
improvements  in  psychological  well-being  may  be  found  without  improvement  in  aerobic  fitness. 

Everyday  Work  Performance  and  Cognition 

Besides  deployment  readiness,  the  everyday  work  environment  benefits  from  a  fit  work  force. 
Workers who are fit are more productive, happier, absent less often, and incur fewer on-the-job
injuries. They suffer less from fatigue and make fewer errors (Shephard, 1992). Fit workers also
file fewer insurance claims. A NASA and U.S. Public Health Service survey (Durbeck et al., 1973) of more than 200
Federal  employees  who  participated  in  a  worksite  exercise  program  revealed  that  workers  who  exer¬ 
cised  felt  that,  as  a  result,  they  could  work  harder  mentally  and  physically,  they  enjoyed  their  work 
more,  and  found  their  normal  work  routine  less  boring. 

Overall,  a  number  of  studies  tend  to  suggest  potentially  beneficial  associations  between  worker 
health  and  productivity.  Although  the  relationships  are  statistically  significant,  the  true  strength  of 
the relationships may be generally low (Bouchard et al., 1993). Therefore, uncertainty exists
regarding the expected cost-benefit ratio when instituting an employee fitness program. In fact, a review
of numerous health promotion program outcomes drew somewhat equivocal conclusions regarding
their true efficacy because of limitations to the experimental design. However, given the paucity of
more rigorous research in the area, improved health as an outcome of increased physical activity has
its  own  inherent  return,  regardless  of  a  more  favorable  cost-benefit  ratio  defined  in  strict  monetary 
terms  (Bouchard  et  al.,  1993). 


Medical Costs and Benefits

Estimates  are  made  about  cost  to  the  general  economy  due  to  lifestyle-related  diseases,  but  it  is 
more  difficult  to  estimate  costs  of  an  unfit  work  force.  In  total,  the  studies  cited  previously  indicate 
that  health  costs  would  be  lower  for  groups  of  employees  who  exercised  more  than  other  groups. 
More  specifically,  the  Canadian  Life  study  showed  that  medical  care  costs  were  decreased  for  a 
group  who  exercised  compared  with  workers  who  did  not  (Nieman,  1995).  At  Prudential,  disabil¬ 
ity  days  were  reduced  more  than  20  percent  for  employees  who  participated  in  a  fitness  program, 
and  the  fitter  workers  had  a  46  percent  lower  rate  of  major  medical  costs.  Tenneco  saw  48  percent 
lower  medical  costs  for  an  exercise  group  compared  with  nonexercisers  (Nieman,  1995). 

A  1992  U.S.  Department  of  Health  and  Human  Services  (U.S.  Department  of  Health  and 
Human Services, 1992) survey included, as benefits of a physically fit work force, improved
employee morale, reduced health insurance costs, reduced absenteeism, increased output and productivity,
reduced on-the-job accidents, and fewer workers’ compensation claims. Other fitness benefits in the
workplace are further documented by Canadian Life (Nieman, 1995), where the absenteeism rate
dropped 50 percent among employees who were “high adherents” in a fitness program. At Prudential,
disability days were reduced more than 20 percent for employees who participated in a fitness
program. Tenneco saw a trend for fewer sick hours for exercisers versus nonexercisers (Nieman, 1995).

Mortality and Level of Fitness

Several independent illnesses described in this chapter have been investigated for their
severity, health consequences, and possible prevention through physical activity. The overwhelming
finding has been that a moderate level of regular physical activity has a very beneficial influence in
reducing the risk of all these illnesses at the same time. This raises a further question: what
is the reduction in the risk of all-cause mortality relative to fitness or physical activity? A few
studies have set out to answer this question.

The  first  investigation  studied  the  relationship  between  changes  in  physical  fitness  and  the  risk 
of  mortality  in  men.  A  group  of  researchers  (Blair  et  al.,  1995)  evaluated  this  relationship  through  a 
prospective  study  of  9,777  men.  The  subjects  were  examined  twice  (mean  of  4.9  years  between  ini¬ 
tial  and  follow-up  exams)  to  assess  changes  in  physical  fitness  and  then  again  assessed  for  mortality 
risk (mean 5.1 years after second exam). The main outcome measures were all-cause mortality (n =
223) and cardiovascular disease (n = 87) mortality. The researchers found the highest age-adjusted,
all-cause death rate among the men who were physically unfit at the time of both examinations. The
lowest death rate was among those who were physically fit at the time of both examinations. The
men who improved from physically unfit to fit between the first and second examination had a 44
percent reduction in mortality risk relative to men who remained physically unfit between the two
exams. There was an inverse correlation between the improvements in fitness and mortality risk.
Each minute increase in maximal treadmill time (level of fitness) between the first and second exams
corresponded with a 7.9 percent decrease in mortality risk. The authors conclude that men who
maintained  or  improved  their  fitness  level  were  significantly  less  likely  to  die  from  all  causes  and  car¬ 
diovascular  disease  during  the  follow-up  than  those  who  were  persistently  physically  unfit. 
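
To make the per-minute figure concrete, the following is a minimal sketch of the arithmetic only; it is not part of the Blair et al. (1995) analysis. It assumes that the 7.9 percent decrease applies multiplicatively to each added minute of maximal treadmill time. The source reports only the per-minute figure, so the compounding rule, the function name, and the parameter names are illustrative assumptions.

```python
def relative_mortality_risk(minutes_gained, decrease_per_minute=0.079):
    """Relative mortality risk after a gain in maximal treadmill time.

    Assumption (not from the source): the 7.9 percent decrease reported
    per added minute compounds multiplicatively over several minutes.
    """
    if minutes_gained < 0:
        raise ValueError("minutes_gained must be non-negative")
    return (1.0 - decrease_per_minute) ** minutes_gained


# One added minute reproduces the reported 7.9 percent decrease.
one_minute = relative_mortality_risk(1)
# Three added minutes, under the compounding assumption stated above.
three_minutes = relative_mortality_risk(3)

print(f"1 minute:  {1 - one_minute:.1%} lower risk")
print(f"3 minutes: {1 - three_minutes:.1%} lower risk")
```

A simple additive reading (7.9 percent times the number of minutes) would give slightly larger reductions; neither extrapolation is stated in the study as summarized here.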

In  another  investigation  (Lee  et  al.,  1999)  researchers  studied  the  independent  benefits  of  vig¬ 
orous  versus  nonvigorous  physical  activity  on  the  risk  of  all-cause  mortality.  They  investigated 
17,321  men  (mean  age  =  46  years)  who  were  free  of  self-reported  or  physician-diagnosed  cardio¬ 
vascular  disease,  cancer,  or  chronic  obstructive  pulmonary  disease.  The  men  answered  question¬ 
naires  concerning  their  physical  activities  at  their  baseline  screening.  During  22  follow-up  years 
there  were  3,728  deaths.  The  authors  reported  reductions  in  the  risk  of  all-cause  mortality  for  those 
who  exhibited  moderate  levels  of  total  energy  expenditure  and  energy  expenditure  from  vigorous 
activities. They did not find such a benefit for those engaging in nonvigorous activities. They
conclude that there is a graded, inverse relationship between total physical activity and mortality. In
addition,  they  found  that  vigorous,  but  not  nonvigorous  activities  were  associated  with  longevity. 
The  researchers  point  out  that  their  findings  only  pertain  to  all-cause  mortality  and  that  nonvigor¬ 
ous  exercise  has  clearly  been  shown  to  benefit  other  aspects  of  health. 

Conclusion 

Again,  more  detailed,  supporting  evidence  of  the  strong  association  between  specific  morbidities 
(and  mortality)  and  fitness  or  health  follows  in  Appendix  A.  All  of  these  investigations  have  gener¬ 
ally  sought  to  further  quantify  linkages  between  physical  activity  and  overall  health.  However,  as 
alluded  to  earlier,  attempting  to  determine  which  types  of  physical  activities  are  the  most  beneficial, 
how  long  and  how  often  they  really  should  be  performed,  and  the  specific  degree  to  which  isolated 
health  problems  may  be  prevented,  alleviated,  or  at  least  attenuated  by  such  activity,  is  very  compli¬ 
cated  because  many  of  these  variables  are  interrelated.  Furthermore,  in  practice,  varying  levels  of  dif¬ 
ficulty  may  be  encountered  when  further  specifying  the  dose-response  relationship,  depending  on 
the  chosen  component  of  fitness,  that  is,  aerobic,  strength,  body  composition.  This  is  primarily  due 
to  the  lack  of  sufficient  research  data  and/or  discrete  analytical  methodologies,  and  is  especially  true 
when  attempting  to  identify  clearly  defined  testing  cut-points  on  which  to  base  standards. 


Exercise  Prescription 


Since  there  are  many  lifestyle  and  environmental  influences,  as  well  as  genetic  factors  that  are 
all  related  to  overall  health  (refer  to  Figure  2.1),  it  is  difficult  to  isolate  specific  types  and  amounts 
of  activity  for  each  person  as  direct  predictors  of  optimized  wellness.  However,  using  the  available 
data  relating  to  physical  activity,  the  U.S.  Government  and  nationally  prominent  health  organiza¬ 
tions  have  constructed  general  activity  guidelines  to  reduce  the  prevalence  and  severity  of  disease 
among the population. According to a joint recommendation from the Centers for Disease Control
and Prevention (CDC) and the American College of Sports Medicine (ACSM) (Pate et al., 1995),
“Every U.S. adult should accumulate 30 minutes or more of moderate-intensity physical activity on
most, preferably all, days of the week” (p. 404). “One way to meet this standard is to walk two miles
briskly”  (p.  404).  Alternatively,  the  ACSM  has  a  more  detailed  and  arguably  more  rigorous  pre¬ 
scription  (Pollock  et  al.,  1998)  that  incorporates  cardiorespiratory  fitness,  muscular  strength  and 
endurance, body composition, and flexibility. These prescriptions have been accepted as very
general recommendations to the American public for achieving and maintaining adequate overall
health and, likely, at least modest improvement in physical capacity and performance. Other
exercise prescription approaches are described later in this chapter.

When choosing an activity to perform, one must consider his or her existing fitness level. This
initial baseline, along with the projected duration, determines the level of exercise intensity.
However, the total volume (frequency × duration) is probably most important for health benefits.
Exercise  adherence  or  consistency  seems  to  be  the  dominant  factor.  Moreover,  if  the  exercise  is  of  suf¬ 
ficient  duration,  a  lower  or  higher  intensity  will  not  alter  the  fact  that  health  benefits  would  be  gained. 
Although  further  improvements  in  cardiovascular  endurance  (physical  performance)  may  be 
observed,  only  a  minimal  threshold  of  activity  intensity  is  required  to  begin  to  reduce  disease  risk.  The 
quality  of  exercise  necessary  for  basic  health  is  therefore  based  on  its  frequency  and  duration  (volume) 
more than its intensity. Generally speaking, increased duration and frequency, along with decreased
intensity, would tend to promote health over performance improvements (Pollock et al., 1998).

As  described  earlier,  the  relationship  between  regular  physical  activity  and  health  has  been  stud¬ 
ied  extensively,  and  there  is  a  multitude  of  evidence  demonstrating  a  positive  correlation  (albeit 
nonlinear)  between  the  two  (U.S.  Department  of  Health  and  Human  Services,  2000).  Figure  2.2 
illustrates  the  dose-response  relationship — an  asymptotic  continuum — between  baseline  activity 
status  and  derived  benefit  from  physical  activity.  Clearly,  the  greatest  health  benefits  are  found  in 
those  people  who  start  at  the  lowest  fitness  levels  and  then  initiate  a  program  of  physical  activity. 
At  the  highest  levels  of  energy  expenditure,  there  is  diminishing  return  and  potentially  increased 
risk  for  injury,  diminished  immune  response,  malaise,  and  so  forth.  Further  confounding  the  pic¬ 
ture  is  the  high  degree  of  dose-response  variability  among  individuals.  From  the  perspective  of 
identifying  a  health-related  standard  or  specific  cut-point,  the  general  notion  of  a  continuum  is 
problematic.  In  other  words,  the  challenge  is  to  scientifically  define  a  threshold  of  physical  activity 
level  (mode,  volume,  and  intensity)  that  relates  to  a  minimum  desirable  level  of  health  and  fitness. 
Furthermore,  to  develop  specific  fitness  standards  that  are  useful  and  accurate,  a  clear  scientific 
rationale  must  be  supported.  In  this  chapter,  this  issue  is  most  germane  (and  difficult)  and  will  be 
considered in relation to a health-based standards development process.

[Figure 2.2 graphic: horizontal axis, Baseline Activity Status, Low to High]

Figure 2.2 The dose-response curve represents the best estimate of the relationship between physical
activity (dose) and health benefit (response). The lower the physical activity status the greater will be the
health benefit associated with a given increase in physical activity (arrows A, B, and C). Copyright 1998 by
the American Public Health Association.


Programs and Recommendations Regarding Physical Activity


Health  and  fitness  in  the  United  States  have  received  national  attention  since  the  1970s.  Many 
national  mandates  have  been  enacted  since  that  decade,  with  increasing  emphasis  on  how  changes  in 
lifestyle can promote health and longevity. Most of the early emphasis was on cardiovascular or
aerobic exercise, but this emphasis has generally evolved into more well-rounded physical training programs. For
example, the ACSM added specific strength-training guidance as late as 1990. The American Heart
Association  recommended  endurance  training  alone  in  1975  and  added  strength  training  in  1992. 
Most  programs  now  recommend  strength  training  in  addition  to  endurance  training,  and  flexibility 
training is also recommended in about a third of the programs. Highlighted here are excerpts from
the ACSM pronouncements and recommendations from the U.S. Department of Health and
Human Services’ Healthy People 2010, the Centers for Disease Control (CDC),
the American Heart Association (AHA), and the National Institutes of Health (NIH).

American  College  of  Sports  Medicine  Position  Stand 

In 1998, the ACSM published a Position Stand (Pollock et al., 1998) titled The Recommended
Quantity and Quality of Exercise for Developing and Maintaining Cardiorespiratory and Muscular
Fitness, and Flexibility in Healthy Adults. The committee’s recommendations were as follows:

Cardiorespiratory Fitness and Body Composition

1. Frequency of training: 3-5 days per week

2. Intensity of training: 55/65%-90% of maximum heart rate, or 40/50%-85% of maximum
oxygen-uptake reserve or maximum heart-rate reserve

3. Duration of training: 20-60 minutes of continuous or intermittent (minimum of
10-minute bouts accumulated throughout the day) aerobic activity. Duration depends on the
intensity of the activity.

4. Mode of activity: Any activity that uses large muscles, can be maintained continuously,
and is rhythmical and aerobic in nature.
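
The intensity ranges in item 2 can be turned into beats-per-minute targets. The sketch below is an illustration under stated assumptions, not an ACSM tool: the function names are hypothetical, the defaults use the lower of each pair of starting percentages (55% and 40%), and the heart-rate-reserve calculation uses the common form of resting rate plus a fraction of the difference between maximum and resting rate. Maximum and resting heart rates are supplied by the caller; the example values are invented.

```python
def percent_of_max_hr(max_hr, low=0.55, high=0.90):
    """Target range as a straight percentage of maximum heart rate."""
    return (low * max_hr, high * max_hr)


def heart_rate_reserve_range(max_hr, resting_hr, low=0.40, high=0.85):
    """Target range based on heart-rate reserve (maximum minus resting).

    Each bound is resting_hr + fraction * (max_hr - resting_hr).
    """
    if resting_hr >= max_hr:
        raise ValueError("resting_hr must be below max_hr")
    reserve = max_hr - resting_hr
    return (resting_hr + low * reserve, resting_hr + high * reserve)


# Hypothetical adult: maximum 180 beats per minute, resting 60.
low_bpm, high_bpm = percent_of_max_hr(180)
print(f"Percent of maximum: {low_bpm:.0f} to {high_bpm:.0f} bpm")

low_bpm, high_bpm = heart_rate_reserve_range(180, 60)
print(f"Heart-rate reserve: {low_bpm:.0f} to {high_bpm:.0f} bpm")
```

Passing `low=0.65` or `low=0.50` selects the higher of each pair of starting percentages listed in the Position Stand.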

Muscular  Strength  and  Endurance,  Body  Composition,  and  Flexibility 

1. Resistance training: Resistance training should be an integral part of an adult fitness
program with sufficient intensity to enhance strength, muscular endurance, and maintain
fat-free mass. Resistance training should be progressive, individualized, and stimulate all
major muscle groups. One set of 8-10 exercises 2-3 days a week is recommended, with
multiple sets providing greater benefit. For those aged 50 years and older, 10-15
repetitions may be more appropriate.

2. Flexibility training: Flexibility exercises should be incorporated into training to develop
and maintain range of motion. Exercises should stretch major muscle groups and be
performed at least 2 or 3 times a week.

The committee devised these recommendations on the basis of an extensive survey of the
health-and-fitness-related literature and compared them with the earlier versions of the Position Stand.
It “has been pointed out that the quantity and quality of exercise needed to attain health-related
benefits may differ from what is recommended for fitness benefits. It is now clear that lower
levels of physical activity (particularly intensity) than recommended by this Position Stand may reduce
the risk for certain chronic degenerative diseases and improve metabolic fitness and yet may not be
of sufficient quantity or quality to improve VO2max” (ACSM, 1990, pp. 265-266).

The  ACSM  views  physical  activity  for  health  and  fitness  in  the  context  of  a  continuum, 
acknowledging  that  many  health  benefits  are  accrued  by  going  from  a  sedentary  to  a  minimal  level 
of  physical  activity.  The  authors  note  that,  “Although  the  fitness  paradigm  that  is  recommended  in 
this  ACSM  Position  Stand  is  adaptable  to  a  broad  cross-section  of  the  healthy  adult  population,  it 
is  clearly  designed  for  the  middle-to-higher  end  of  the  exercise/physical  activity  continuum.” 


Recommendations from Healthy People 2010

The  U.S.  Department  of  Health  and  Human  Services  (2000)  published  the  goals  relating  to 
physical activity that were established for the U.S. population for the year 2010. A large
committee of the country’s leading health experts established these goals in addition to many other goals
relating to the quality of life for Americans. In order to focus on the key issues, they established the
“Leading Health Indicators” (LHI) for each goal, including physical activity. The following is an
excerpt from the LHI for physical activity:

Regular and sustained physical activity has documented beneficial effects on cardiovascular
functioning (e.g., reducing hypertension and hypercholesterolemia) but also on the prevention of
osteoporosis and its sequelae (e.g., hip fractures), the effects of osteoarthritis, and on such mental
conditions as depression. Physical activity is also an important element of weight control. This indicator
addresses physical activity across the spectrum. Although vigorous physical activity is associated
with a decreased risk of cardiovascular disease, hypertension, and some cancers, a growing body of
literature indicates that more moderate levels of activity can be beneficial to a person’s health.
Furthermore, the general population and diverse population groups are more likely to participate
in moderate levels of physical activity (p. 37).

This LHI for physical activity also calls out the following prescriptions:

Increase the proportion of adolescents who engage in vigorous physical activity that promotes
cardiorespiratory fitness 3 or more days per week for 20 or more minutes per occasion. Increase the
proportion of adults who engage regularly, preferably daily, in moderate physical activity for at least
30 minutes per day (p. 37).

These suggestions are based on the finding that only 15 percent of adults engaged in the
recommended amount of physical activity in 1997, and 40 percent of adults did not engage in any
leisure-time physical activity. That same year only 64 percent of adolescents engaged in the
recommended amount of physical activity. The experts involved in Healthy People 2010 declare the
following health benefits from regular physical activity:

• Increased muscle and bone strength
• Increased lean muscle and decreased body fat
• Aided in weight control and was a key part of any weight-loss effort
• Enhanced psychological well-being and even reduced the risk of developing depression
• Reduced symptoms of depression and anxiety and improved mood (p. 37)

In addition, they report that, “The major barriers most people face when trying to increase
physical activity are lack of time, access to convenient facilities, and safe environments in which to
be active” (U.S. Department of Health and Human Services, 2000, p. 37).


Recommendations from CDC/ACSM

The Centers for Disease Control (CDC) and the American College of Sports Medicine
(ACSM) collaborated on recommendations for physical activity and public health. They suggested
that the current low rate of participation in regular physical activity may be due in part to the
misperception that to benefit one must engage in vigorous continuous exercise. The scientific evidence
clearly demonstrates that regular, moderately intense physical activity provides equal and
substantial health benefits. Further, CDC and ACSM suggest that every U.S. adult should accumulate 30
minutes or more of moderately intense physical activity on most, preferably all, days of the week.
Intermittent activity also confers substantial benefits. Therefore, the recommended 30 minutes of
activity  can  be  accumulated  in  short  bouts  of  activity:  walking  up  the  stairs  instead  of  taking  the 
elevator,  walking  instead  of  driving  short  distances,  doing  calisthenics,  or  pedaling  a  stationary  cycle 
while  watching  television.  The  health  benefits  gained  from  increased  physical  activity  depend  on 
the  initial  physical  activity  level.  Sedentary  individuals  are  expected  to  benefit  the  most  from 
increasing  their  activity  to  the  recommended  level. 

Both  CDC  and  ACSM  also  identified  two  other  components  of  fitness  here — flexibility  and 
muscular  strength — which  should  not  be  overlooked.  Clinical  experience  and  limited  studies  sug¬ 
gest  that  people  who  maintain  or  improve  their  strength  and  flexibility  may  be  better  able  to  avoid 
disability, especially as they advance into older age. An active lifestyle does not require a
regimented, vigorous exercise program. Instead, small changes that increase daily physical activity will enable
people to reduce their risk of chronic disease and may contribute to an enhanced quality of life.
Finally, if Americans who lead sedentary lives would adopt a more active lifestyle, an enormous
benefit  to  the  public’s  health  and  to  individual  well-being  would  result. 

Recommendations  from  the  American  Heart  Association 

The American Heart Association (AHA) presented a statement on exercise describing physical
activity and its benefits for health. They suggested that “Persons of all ages should include physical
activity in a comprehensive program of health promotion and disease prevention and should
increase their habitual physical activity to a level appropriate to their capacities, needs, and interest”
(AHA, 1997, p. 4). AHA goes on to explain that:

For health promotion, dynamic exercise of the large muscles for extended periods of time (30 to 60
minutes, three to six times weekly) is recommended. This may include short periods of
moderate-intensity (60% to 75% of maximum capacity) activity (approximately 5 to 10 minutes) that total
30 minutes on most days. Resistance training using 8 to 10 different exercise sets with 10 to 15
repetitions each (arms, shoulders, chest, trunk, back, hips, and legs) performed at a moderate to high
intensity (for example, 10 to 15 pounds of free weight) for a minimum of two days per week is
recommended. (U.S. Department of Health and Human Services, 1996, p. 4)

Recommendations  from  the  National  Institutes  of  Health 

The National Institutes of Health (NIH) developed a consensus statement, prepared by
nonadvocate, non-Federal experts (NIH, 1996). It was based on the results of a conference of these
experts in which investigators reported their findings regarding health, conducted
question-and-answer sessions, and engaged in closed deliberations. Their consensus was as follows:

All Americans should engage in regular physical activity at a level appropriate to their capacity,
needs, and interest. Children and adults alike should set a goal of accumulating at least 30 minutes
of moderate-intensity physical activity on most, and preferably all, days of the week. Most
Americans have little or no physical activity in their daily lives, and accumulating evidence indicates
that physical inactivity is a major risk factor for cardiovascular disease. However, moderate
levels of physical activity confer significant health benefits. Even those who currently meet these
daily standards may derive additional health and fitness benefits by becoming more physically
active or including more vigorous activity. For those with known cardiovascular disease, cardiac
rehabilitation programs that combine physical activity with reduction in other risk factors should
be more widely used (NIH, 1996, p. 41, cited in U.S. Health and Human Services).


Comparison  of  the  Recommendations  of  National  Fitness  and 
Health  Groups 

A more rigorous history and review of physical activity recommendations appears
in the Surgeon General’s report on Physical Activity and Health (U.S. Department of Health and
Human Services, 1996). It emphasizes that people can increase their physical activity in many ways.
It is acknowledged that both the more structured (ACSM Position Stand) and lifestyle
(CDC/ACSM) approaches can work for the relatively sedentary person. Significant points of
agreement across the board are as follows:

•  Most  persons  should  accumulate  at  least  30  minutes  of  moderate,  aerobic  activity  most  days 
of  the  week, 

• Additional health and functional benefits come with increased exercise intensity and/or volume,

•  Resistance-training  programs  should  be  accomplished  at  least  twice  a  week,  involving  8  to 
12  major  muscle  group  exercises  and  repetitions  each  (multiple  sets  are  not  required). 

However, some may still view this general guidance as too liberal or seemingly conflicting. This
seems especially true with regard to the issue of intermittently (versus continuously) performed exercise.

As indicated earlier, in addition to moderately intense activity, recent consensus physical-activity
guidelines include recommendations that allow for accumulating moderately intense physical
activity over a 24-hour period (ACSM-CDC in the Journal of the American Medical Association; Pate
et al., 1995). These recommendations are based on evidence suggesting that comparable health and
fitness benefits occur as long as the total amount of energy expended in moderately intense physical
activity accumulated over the course of each day achieves the recommended levels, at least if the
individual bouts are more than a few minutes (Pate et al., 1995). In many of the studies demonstrating
a strong, inverse association between the level of physical activity or fitness and all-cause and
cause-specific morbidity and mortality, the level of activity has been moderately intense and sometimes
performed intermittently (Haskell, 1994; Leon et al., 1987; Paffenbarger et al., 1986; Pate et al.,
1995; U.S. Department of Health and Human Services, 1996). Epidemiological data are supported
by clinical studies comparing longer (traditionally 30 minutes of continuous activity) versus shorter
(5 to 15 minutes) bouts of activities spread throughout the day (DeBusk et al., 1990; Jakicic et al.,
1999). These studies reveal that comparable gains in cardiorespiratory fitness and various health
measures occur with intermittent bouts of physical activity when the total amount of exercise is the
same. Although more research remains to be done, the current consensus among major groups making
physical activity recommendations is that intermittent activity is beneficial. In addition to
comparable health and fitness benefits, multiple short bouts of physical activity may increase
participation and adherence (Jakicic et al., 1999). Clearly, the ACSM Position Stand is targeted at the
middle to higher end of the physical activity training program continuum.
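
The accumulation rule discussed above can be illustrated with a short calculation. This is only a sketch under stated assumptions: the 5-minute minimum bout length is a hypothetical reading of "more than a few minutes," the 30-minute target is the consensus figure quoted in the text, and the function names are illustrative rather than drawn from any cited guideline.

```python
def accumulated_minutes(bouts, min_bout=5):
    """Sum the daily activity bouts (in minutes) that are long enough to count.

    min_bout=5 is an assumed threshold; the text says only that bouts
    should last "more than a few minutes".
    """
    return sum(b for b in bouts if b >= min_bout)


def meets_daily_target(bouts, target=30, min_bout=5):
    """Return True if the qualifying bouts add up to the daily target."""
    return accumulated_minutes(bouts, min_bout) >= target


# Three 10-minute walks count the same as one continuous 30-minute session.
print(meets_daily_target([10, 10, 10]))    # True
# Very short bouts are ignored under the assumed 5-minute threshold.
print(meets_daily_target([2, 3, 10, 10]))  # False
```

Changing `min_bout` shows how sensitive the result is to the bout-length threshold, which the consensus statements leave loosely defined.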


The  Dose-Response  Relationship  and  Test  Outcomes 


“Everybody agrees that combat troops should be physically fit, but can we be more specific about the
requirement?” (Ramsey, Air Force Magazine, 1990)

The  issue  of  dose-response  might  be  viewed  as  the  description  of  how  much  of  what  type  of 
physical  activity  is  required  to  achieve  specific  performance  or  health-related  outcomes.  Obviously, 
there  is  some  modest  disagreement  among  groups  as  to  the  amount  of  physical  activity  required  for 
good health. There is consensus that regular physical activity is a part of a healthy lifestyle (which
is fully compatible with Military readiness), and this habitual activity contributes to both health
and fitness (Institute of Medicine, 1998). Maximum requirements of some jobs would require more
rigorous physical training programs (a little more is almost always better). It should be remembered
that the regularity of the activities, whatever they may be, is of utmost importance. On the other hand,
governmental physical fitness programs have tended to focus on test outcomes (part of the response).
Interestingly,  it  has  been  noted  that  simply  enforcing  past  Military  fitness  programs  has  not 
achieved  an  improvement  in  the  overall  fitness  levels  (Institute  of  Medicine,  1998).  It  can  also  be 
argued  that  these  testing  modalities  are  not  truly  “job-related”  to  any  reasonable  extent. 

Although  there  are  many  health  benefits  that  may  increase  in  a  dose-response  manner  with 
exercise,  there  is  presumably  a  point  at  which  exercise  no  longer  provides  increasing  benefits  and 
may  even  be  associated  with  increasing  injury  risks.  The  Surgeon  General’s  Report  (1996)  indicat¬ 
ed  that  improved  health  and  disease  prevention  is  associated  with  the  recommended  30  minutes  of 
accumulated  activity  on  most  days  and  increasing  that  activity  level  to  some  critical  point  can  pro¬ 
vide  further  benefits.  Although  this  point  is  debated,  it  is  hypothesized  that  the  greatest  benefits 
are  seen  between  30  and  60  minutes  of  exercise  and  that  beyond  60  minutes  the  benefits  plateau 
and  the  risks  increase.  In  addition  to  the  risk  of  decreased  compliance  due  to  minor  injury  or  dis¬ 
comfort,  there  is  the  concern  that  overindulging  in  strenuous  exercise  may  result  in  catastrophic 
conditions  like  a  heart  attack. 

The  specific  dose  parameters  (volume,  intensity,  and  modality)  associated  with  health-related 
responses  may  be  quite  different  than  those  typically  identified  when  the  goal  is  physical  perform¬ 
ance  improvement  (Haskell,  1994).  Further,  exercise  prescription  may  be  based  on  either  a  person’s 
relative  capacity  or  on  an  absolute  intensity.  In  addition,  a  variety  of  personal  characteristics  should 
influence  the  dose-response  relationship  for  any  specific  outcome.  These  characteristics  would 
include  age,  gender,  health  status,  health  habits,  and  fitness  or  activity  levels.  Perhaps  the  major 
consideration  here  is  the  quite  variable  inter-individual  response  relative  to  very  similar  exercise 
prescriptions (or doses). Figure 2.3, with data from Dionne, Thibault, and Lucie (1991), shows the
varied response of young men to a highly standardized aerobic training program. Figure 2.3 plots
change, after training, in maximal oxygen uptake at onset of blood lactic acid across 29 subjects.
The improvement in maximal oxygen uptake varied more than tenfold across subjects. Finally, most
of  the  work  in  the  area  of  aerobic  (versus  strength)  training  has  focused  on  the  change  in  aerobic 
capacity,  not  health-related  biological  changes.  In  fact,  with  the  exception  of  outcomes  for  reduced 
adiposity  or  insulin  resistance,  very  little  is  known  about  the  degree  of  health  improvement  (or 
response)  to  a  specific  dose  of  activity  (Haskell,  1994).  Further  confounding  the  picture  is  the  lack 
of  linearity  in  responses  to  training  and  detraining  as  well  as  the  larger  distinctions  between  the 
acute  and  accumulated  outcomes. 


[Figure 2.3 graphic not reproduced. Horizontal axis: individual subjects in endurance exercise training program.]


Figure 2.3 Change in maximal oxygen uptake at onset of blood lactic acid for subjects after 12 weeks of
endurance training. Reprinted, by permission, from Dionne et al., Medicine and Science in Sports and
Exercise. The varied response of young men to a highly standardized aerobic training program. 1991, 21,
177. Lippincott, Williams, & Wilkins: MD.

Another way to view this issue may be from a more qualitative approach: optimal versus adequate
versus minimal doses. An adequate exercise dosage, which is described as the level at or above the
threshold for the desired physiological change, may be more easily identified in the literature. On the
other hand, very little is currently known about the optimal or minimal dosages of exercise needed for
the desired effects. Fortunately, the available knowledge is still adequate for use in exercise prescription
settings directed at the general improvement of health-related outcomes (Haskell, 1994). Again, there
is generally a positive relationship between dose and health outcome across the entire range of exercise
prescriptions, that is, more exercise is better (up to a point). Moreover, the greatest health benefits
are garnered by the minimally fit after subscribing to at least a modest physical fitness program.
Finally, the amount of physical activity that corresponds to low, moderate, and high levels of
cardiorespiratory fitness was explored by Stofan et al. (1998). They used questionnaires administered
to a clinical population to inquire about the energy expenditure of 13,444 men and 3,972 women
aged 20 to 87. The individual fitness levels were assessed by a maximal exercise treadmill test, and
then compared with the reported individual energy expenditures. Average leisure time energy
expenditures of 525 to 1,650 kcal/week for men, and 420 to 1,260 kcal/week for women were
associated with moderate to high levels of fitness. Such energy expenditure can be achieved with a brisk
walk of approximately 30 minutes duration most days of the week. The authors concluded that
most people should be able to achieve these physical fitness levels. Subsequently, such activity will
improve cardiorespiratory fitness enough to result in substantial health benefits.
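
The claim that a brisk 30-minute walk on most days falls within these weekly energy ranges can be checked with a rough estimate. The sketch below is not the method used by Stofan et al.; it assumes the common approximation that 1 MET is about 1 kcal per kg of body mass per hour, and it assumes a brisk walk of about 4 METs, body masses of 80 kg and 65 kg, and 5 walking days per week. Only the kcal/week ranges come from the text.

```python
# Weekly energy ranges (kcal/week) reported in the text as associated
# with moderate to high fitness.
FITNESS_RANGES = {"men": (525, 1650), "women": (420, 1260)}


def weekly_kcal(met, body_mass_kg, minutes_per_day, days_per_week):
    """Estimate weekly energy expenditure, assuming 1 MET ~ 1 kcal/kg/hour."""
    hours_per_day = minutes_per_day / 60
    return met * body_mass_kg * hours_per_day * days_per_week


def in_reported_range(kcal, sex):
    """Check an estimate against the range reported for the given sex."""
    low, high = FITNESS_RANGES[sex]
    return low <= kcal <= high


# Assumed inputs: 4 METs (brisk walk), 30 minutes/day, 5 days/week.
man = weekly_kcal(4.0, 80, 30, 5)     # 800.0 kcal/week
woman = weekly_kcal(4.0, 65, 30, 5)   # 650.0 kcal/week
print(man, in_reported_range(man, "men"))
print(woman, in_reported_range(woman, "women"))
```

Under these assumed inputs both estimates fall inside the reported ranges, which is consistent with the statement in the text; different body masses or walking speeds would shift the result.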


Physical  Activity  Versus  Test  Outcomes/Standards 

It has been suggested that by subscribing to the ACSM guidelines, a consistently ready, fit, and
healthy force would be maintained (Institute of Medicine, 1998). Although this may appear
teleologically acceptable, it is primarily a quantification of the dose and not a firm metric for a
measurable outcome. Therefore, it definitely falls short as a testable fitness-related standard. This is a very
important point or hurdle in the physical fitness standards development process. On the other hand,
just passing a medical examination, that is, presenting with no pathologies (metabolic fitness), does
not ensure compliance with a program of regular physical activity. In essence, passing a medical screen
is the first (and often only) selection or retention standard to qualify for a job or career entrance.


Specific  Approaches  to  Setting  Cut-points  or  Standards 


Indeed, the literature is very sparse with regard to methodologies and databases on which one
can base cardiovascular standards. We will identify those efforts that have sought to define specific
health-based fitness standards.


Aerobic 

Cooper  Institute  Aerobics  Center  Longitudinal  Study:  Health-Related  Fitness — In  this  study, 
current U.S. Air Force minimum fitness standards (see Chapter 1 in this SOAR) were evaluated
against  data  from  a  large  cohort  of  men  and  women  from  the  Cooper  Institute  Aerobics  Center 
Longitudinal  Study  (ACLS)  to  determine  the  appropriateness  of  these  standards  as  a  criterion 
measure  of  health-related  fitness  (versus  task-related  fitness  standards). 

The ACLS cohort had accumulated about 460,000 person-years of follow-up with aerobic fitness
assessment among some 41,000 men and women. The ACLS study revealed approximately
1,100 deaths. Although this particular type of analysis had not been done before, it is noteworthy
that this cohort is comparable to many population-survey samples with respect to fitness levels. The
ACLS cohort served to determine, for both sexes and several age groups, standards of aerobic fitness.
Anyone who attains the standard would have no more than a 50 percent greater death rate
than the general population of the same age, gender, and risk profile. For example, if the mortality
risk in a particular age and gender category of the general population is 2 deaths per 1,000 persons
per year, the mortality risk for anyone of that same age and gender who meets the minimal fitness
standard would be at most 3 deaths per 1,000 persons per year. The 50 percent cut-point is
comparable to that for elevated death rates due to cigarette smoking, high blood pressure, and high
serum cholesterol. These ACLS-based health-related cut-points (standards) are presented in Table
2.2. These standards are based on what is necessary to avoid a 50 percent greater mortality risk. The
minimally fit cut-point (bottom quintile) used in previous studies by these researchers is a bit higher
than this, but this cut-point identifies a group that is at somewhat higher risk. Due to the limited
numbers of deaths occurring at younger ages, the categories presented are necessarily narrower
than those for the current U.S. Air Force minimal fitness standards. Unfortunately, a similar
analytical approach could not be accomplished relative to all-cause morbidity.
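
The arithmetic behind the 50 percent cut-point can be written out as a short sketch. The function names are illustrative only, and the code simply restates the worked example in the text (2 deaths per 1,000 persons per year in the general population, at most 3 for someone meeting the standard); it is not the statistical procedure used in the ACLS analysis.

```python
def max_acceptable_death_rate(population_rate, excess=0.50):
    """Highest death rate allowed by the cut-point.

    population_rate: deaths per 1,000 persons per year in the general
    population of the same age and gender.
    excess: allowed relative excess (0.50 means 50 percent greater).
    """
    return population_rate * (1 + excess)


def within_cut_point(observed_rate, population_rate, excess=0.50):
    """True if an observed death rate does not exceed the allowed maximum."""
    return observed_rate <= max_acceptable_death_rate(population_rate, excess)


# Worked example from the text: 2 deaths per 1,000 persons per year.
print(max_acceptable_death_rate(2.0))  # 3.0
```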


Table 2.2  Aerobics Center Longitudinal Study health-related fitness standards

Age (Years)   Males* (ml/kg/min)   Age (Years)   Females* (ml/kg/min)
<40           30.2                 (insufficient data here)
40-49         29.3                 <50           27.8
50-59         27.6                 50-59         21.2

* = VO2max
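
As an illustration of how the cut-points in Table 2.2 might be applied, the following sketch looks up the standard for a given sex and age and compares it with a measured VO2max. Several points are assumptions rather than part of the source: ages are treated as whole years, the female "<50" row is read literally as covering every age under 50 (even though the table flags insufficient data for the youngest group), ages of 60 and above return no standard, and all names are hypothetical.

```python
# Cut-points from Table 2.2 as (lowest age, highest age, VO2max in ml/kg/min).
ACLS_STANDARDS = {
    "male": [(0, 39, 30.2), (40, 49, 29.3), (50, 59, 27.6)],
    "female": [(0, 49, 27.8), (50, 59, 21.2)],  # "<50" read literally (assumption)
}


def acls_cut_point(sex, age):
    """Return the Table 2.2 cut-point, or None if the table has no entry."""
    for low, high, vo2max in ACLS_STANDARDS[sex]:
        if low <= age <= high:
            return vo2max
    return None


def meets_acls_standard(sex, age, vo2max):
    """Return True/False against the cut-point, or None when no cut-point exists."""
    cut_point = acls_cut_point(sex, age)
    if cut_point is None:
        return None
    return vo2max >= cut_point


print(acls_cut_point("male", 45))             # 29.3
print(meets_acls_standard("female", 55, 20))  # False
```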


These preliminary analyses suggest that current U.S. Air Force minimum fitness standards are
sufficient to promote health-related fitness, at least for males. The more limited findings for
females imply that comparable U.S. Air Force standards may be too low for most women (i.e.,
under the age of 50). Generally speaking, persons who are physically active at the levels set out
in the recent consensus public health recommendations would be highly likely to achieve
the current U.S. Air Force minimum fitness standards.

Hoeger and Hoeger (1998) have attempted to identify aerobic health-fitness standards on the
basis of “epidemiological data linking minimum fitness values to disease prevention and health” (p.
7). The specific procedure used here simply identified the large decrease in mortalities observed by
Blair et al. (1989) between the first (lowest) and combined second (and third) fitness quintiles as
the “health threshold.” These “cut-points” related to values of maximum oxygen uptake of 35 and
32.5 ml/kg/min for men and women, respectively. The authors suggest that regular exercise at
approximately 50 percent of HRmax capacity should accomplish this training effect and
qualitatively describe this level as “average.”


Muscle  Strength  and  Endurance 

An upper BMI limit of 25 to 27 presents a fairly narrow range on which to focus as a cut-point
or standard. Interestingly, the Institute of Medicine went on to recommend that a range of 25 to 30
be considered as a “caution” zone, with further Military disposition depending on the outcome of the
Service member’s physical fitness testing. The application of a health-related body composition
standard here certainly has scientific support. It is also noteworthy that others (Hodgdon, 1998) have
previously supported basing body composition standards on health considerations because of the lack of
any scientific basis for using either performance or appearance criteria.

In-depth literature reviews and analyses have recently been completed by the Human Systems
Information and Analysis Center. The topics included muscular strength and muscular endurance
relative to body composition (Palmer, Rench, Carroll, & Constable, 2000). In all, no published
test metrics or standards specifically linked to health-related levels of fitness were found. The
exception to this observation might be found in Hoeger and Hoeger, who at least notionally identified
health fitness standards for muscular strength and endurance. However, on closer review these
appear to have been chosen from a normative standpoint of around the 50th percentiles.

Body  Composition 

As noted, setting standards on health-related criteria may not be so simple. Perhaps the approach
of establishing a health-related fitness standard lends itself better to the fitness-related parameter
of body composition. A plethora of information strongly reveals the distinct relationship between
BMI (body mass index) and health as a classic J-shaped curve. It is most desirable to maintain a
BMI (wt/ht²) of between 19 and 25 kg/m², as both relative underweight and overweight are
accompanied by impaired physical performance and increased risk of morbidity and mortality
(Institute of Medicine, 1998). However, in the past most of the interest focused on high BMIs. It
might also be argued that increasing an upper BMI limit to 26 to 27 would not necessarily increase
all-cause mortality risks.
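
The BMI formula (weight in kilograms divided by height in meters squared) and the ranges discussed in this chapter can be sketched as follows. The 19 to 25 desirable range and the 25 to 30 "caution" zone come from the text; how the exact boundary values are assigned, the labels for values outside those ranges, and the function names are assumptions made for illustration.

```python
def bmi(weight_kg, height_m):
    """Body mass index: weight (kg) divided by height (m) squared."""
    return weight_kg / height_m ** 2


def classify_bmi(value):
    """Place a BMI value in the ranges discussed in the text.

    Boundary handling (for example, 25 counted as desirable) is an assumption.
    """
    if value < 19:
        return "below desirable range"
    if value <= 25:
        return "desirable"
    if value <= 30:
        return "caution zone"
    return "above caution zone"


value = bmi(70, 1.75)
print(round(value, 1), classify_bmi(value))  # 22.9 desirable
```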


Further  Military  Relevance 

Readiness  and  Fitness 

“The human factor in readiness and warfare always has determined the end results of hostile
conflict.” (“Total Force, Total Health,” an article in Leading Edge, 1999, Air Force Materiel
Command's monthly publication)

The number one component in readiness of our Military has been identified as qualified
people (Armed Forces Journal International, January 1999). Herrold (Institute of Medicine, 1998)
defined readiness for a Military mission as maximizing performance, minimizing unplanned
losses, and adapting to changing environments. For the sake of this discussion, “readiness” will be
considered the general preparedness and fitness necessary for a person to perform more than just basic
Military tasks as determined by physical exercises such as push-ups, sit-ups, and running.
Interestingly, the Military Services have on occasion referred to these physical fitness measures as
their physical readiness test (PRT) (Institute of Medicine, 1998). The ACSM Fitness Book
(American College of Sports Medicine, 1992) describes health-related fitness as the ability to carry
out daily tasks and unexpected bodily challenges with a minimum of fatigue and discomfort, or
having the reserve to do all you want and more. Perhaps this could be described as a state of civilian
readiness. Therefore, Military Services need to be a fit and “ready” force so that:

• Personnel will perform well under deployment or emergency conditions

• Everyday jobs can be undertaken safely and efficiently

• Costs due to absenteeism and medical problems will be minimized.

More  specifically,  cognitive  performance  as  well  as  physical  performance  may  be  enhanced 
under  emergency  conditions  if  personnel  are  physically  fit.  A  U.S.  Army  report  (Pleban,  Thomas, 
& Thompson, 1985) found that the more physically fit soldiers, as assessed on a battery consisting
of chin-ups, push-ups, sit-ups, two-mile run time, and pulse rate by the Harvard Step Test,
performed better on a cognitive test battery and had lower fatigue ratings during a two-and-a-half day
Ranger-type  sustained  operations  simulation. 

Job-Specific Fitness Requirements

At  first  glance  it  might  appear  that  the  distinctions  among  different  wartime  missions  specific  to 
the  various  Services  would  enable  them  to  be  held  to  different  fitness  standards.  For  example,  most 
U.S.  Air  Force  troops  would  not  be  performing  the  kinds  of  more  physical  tasks  that  infantry  bat¬ 
talions  do.  However,  despite  the  emphasis  by  some  on  the  pilot  warfighter,  many  of  the  occupations 
within  the  U.S.  Air  Force  performed  on  an  ongoing  basis  as  well  as  those  that  might  be  needed  dur¬ 
ing  deployment  do  require  physical  labor.  Some  units  will  be  fit  enough  for  deployment  because  of 
their  everyday  Military  jobs  such  as  civil  engineering  or  maintenance  units  who  lift  and  load  every 
day.  But  also  consider  medical  personnel  who,  when  deployed,  must  carry  heavy  medical  equipment 
as  well  as  personal  gear  as  they  get  on  and  off  the  aircraft.  When  on  the  ground,  medical  personnel 
must  then  be  prepared  to  erect  tents  and,  later,  to  transport  patients  on  gurneys.  Medical  personnel, 
whose  daily  tasks  may  not  be  physically  demanding,  may  thus  be  faced  with  a  sudden  demand  to 
perform  physical  labor.  They  will  not  be  prepared  for  physical  labor  by  their  daily  work  tasks.  Only 
fitness  training  outside  of  the  work  area  can  prepare  them  sufficiently  for  deployment. 

It  could  be  further  argued  that  the  mission  of  any  Military  Service  is  to  maintain  a  level  of  fit¬ 
ness  year-round  that  will  enable  all  personnel  to  perform  any  possible  deployment  task  without 
fatigue  under  harsh  environmental  conditions  when  time  and  other  stressors  provide  additional  tax¬ 
ing  of  resources.  If  there  is  to  be  just  one  fitness  standard  for  a  Military  Service  member,  should 
that standard represent the degree of fitness required to do the most difficult task to be found during
deployment, under the most rigorous environmental and stress pressures imaginable? The various
difficulties of instituting job-specific performance tests in the Military have been described not
only in this document but elsewhere (Institute of Medicine, 1998). Attempts to institute
occupation-related performance tests in the Military date back to the U.S. Army Air Corps’ programs
during WWII (Institute of Medicine, 1998). There are major concerns regarding this
alternative of job-specific physical performance tests/standards, for example, the potentially large
number of tests the Military Services would be required to devise and administer, the frequency with
which people are assigned to new occupations and/or promoted (Institute of Medicine, 1998),
along with the potential lack of test sensitivity and specificity.

Moreover, performance on Military fitness tests does not correlate well with performance on
task-specific job tests or even strength tests required for those Military vocations that demand at least
moderately heavy lifting or carrying capacities (Institute of Medicine, 1998). A general theme throughout
this SOAR is the goal of developing validated occupational performance tests, which is a complex
process in and of itself. The only feasible approach to this for the Military would be to “group” jobs into
testing categories. This approach, however, would tend to further confound the process in most cases.


U.S.  Air  Force  Direction 

The  Military  environment  may  present  some  additional  challenges  as  well  as  unique  opportu¬ 
nities.  Although  the  Military  recognizes  the  importance  of  adequate  physical  fitness  for  its  mem¬ 
bers,  it  could  be  argued  that  a  generic  test  to  ensure  full  Military  readiness  and  mission  success  is 
currently not practical or even feasible. Therefore, the U.S. Air Force is exploring a two-tier
approach to establishing physical fitness standards for its members. Figures 2.4 and 2.5 are notional
depictions of the two types of standards. The objective of Tier I would primarily be
health-based/general readiness fitness, with programs and standards that apply to all U.S. Air Force
personnel. These standards would be gender-dependent to account for the physiological differences
between  men  and  women.  Personnel  must  meet  these  threshold  values  to  signify  a  health-related 
level  of  fitness  above  which  distinct  health  benefits  would  be  realized  and  identified.  However,  a 
person  whose  levels  were  below  these  standards  would  be  susceptible  to  the  increased  risk  of  injury 
and  disease,  and  decreased  readiness,  ability  to  deal  with  stressors,  and  cognitive  capabilities. 

Tier II of this U.S. Air Force approach would focus on an occupation-specific, performance-based
fitness program that would further enhance mission readiness and accomplishment.
Performance-based standards are gender independent, with thresholds based on occupational
requirements. These thresholds would represent the level of physical fitness necessary for
personnel to meet the physical requirements of their U.S. Air Force Specialty Code (AFSC).
Inability to meet these standards would imply an increased risk of mission-specific failure and
greater physical fatigue. Therefore, the Tier II approach might be considered an outgrowth of the
U.S. Air Force’s more limited Strength Aptitude Test (see Chapter 1), which is currently employed.
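
The two-tier structure described above can be pictured as a small data model. The sketch below is illustrative only: the measures, the threshold values, and the specialty code "XXXX" are hypothetical placeholders, not values from this report. Only the structure follows the text, namely gender-dependent Tier I thresholds that apply to everyone and gender-independent Tier II thresholds keyed to occupation.

```python
from dataclasses import dataclass

# Hypothetical Tier I thresholds (health-based, gender-dependent).
# Placeholder values for illustration, not published standards.
TIER1_MIN_AEROBIC = {"male": 35.0, "female": 30.0}

# Hypothetical Tier II thresholds (performance-based, gender-independent),
# keyed by a made-up Air Force Specialty Code.
TIER2_MIN_LIFT_LB = {"XXXX": 70.0}


@dataclass
class Member:
    gender: str              # "male" or "female"
    afsc: str                # occupational specialty code
    aerobic_score: float     # result on an assumed aerobic fitness measure
    lift_capacity_lb: float  # result on an assumed lifting test


def meets_tier1(m: Member) -> bool:
    """Tier I applies to all personnel; the threshold depends on gender."""
    return m.aerobic_score >= TIER1_MIN_AEROBIC[m.gender]


def meets_tier2(m: Member) -> bool:
    """Tier II depends only on the occupation, never on gender."""
    required = TIER2_MIN_LIFT_LB.get(m.afsc)
    if required is None:  # no occupational standard defined for this code
        return True
    return m.lift_capacity_lb >= required
```

In this arrangement a member is always screened against Tier I, while Tier II is consulted only when the member's specialty has a defined physical requirement.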

Appearance in the Military Context

A healthy physical appearance has long been valued by the U.S. Military establishments. The
1992 book, Body Composition and Physical Performance (Institute of Medicine, 1992), indicates
that the “appearance” rationale for body composition standards does not have a substantial
relationship to performance, fitness, nutrition, or health (Institute of Medicine, 1998). However,
officials from two Services, the U.S. Army and the U.S. Marines, told General Accounting Office
staff that appearance was one objective of their fitness programs, stating that image is an
important component of effectiveness. Since the image of a soldier is one of leanness, a fat
appearance could weaken the Military image and undermine effectiveness and, thus, readiness.
The U.S. Navy reported that appearance is not an appropriate objective of their body fat program
(Hodgdon, 1992); rather, U.S. Navy body fat results are incorporated into a member’s rating in
the “Military bearing” category of officer fitness reports and enlisted personnel evaluations.

Figure 2.4 Notional depiction of health-based physical fitness standards (Tier I)

Several studies throughout the years have sought to establish the relationship between what is
accepted as Military appearance and genuine measures of body composition or height and weight.
A brief history of the relationship between visually judged Military appearance and actual body
composition includes the work of Dupertuis (1950), who found a correlation of -0.85 between
endomorphy ratings and body specific gravity. In 1952, Brozek and Keys found a mean correlation
of 0.67 when subjects were rated before and after a period of semistarvation. Ward, Sutherland,
and Blanchard (1976) found reliable responses of body fatness by visual appraisal, as did
Blanchard, Ward, Kryzwicki, and Cannam (1979). Sterner (1984) used photographs and two raters to
estimate fatness. Correlations between percent fat as documented by hydrodensitometry and that
predicted from visual estimation were 0.80 and 0.79 for the two raters. Test-retest correlations
were 0.93 and 0.95 for these same raters.
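
To make the validity and test-retest figures above concrete, the sketch below shows how such a coefficient is computed, assuming the reported values are ordinary Pearson product-moment correlations (the text does not say which coefficient was used). The data are made up for illustration and are not taken from any of the studies cited.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Made-up data: percent body fat from hydrodensitometry and one rater's
# visual estimate for the same six subjects.
measured = [12.0, 18.5, 22.0, 27.5, 31.0, 35.5]
visual = [14.0, 17.0, 25.0, 24.0, 33.0, 34.0]

r = pearson_r(measured, visual)
print(f"validity correlation r = {r:.2f}")
```

The same function applied to a rater's first and second set of estimates would give a test-retest (reliability) coefficient.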

Subsequent to this work, Hodgdon, Fitzgerald, and Vogel (1990) conducted an experiment to
determine how strongly ratings of Military appearance and fatness were associated, and to
consider how reliable and valid assessments of fatness could be made in a Military population
that includes personnel of both genders and various ages and races. The subjects were 1,326 U.S.
Army active duty personnel, including men and women, whose body composition was established by
hydrodensitometry. Appearance and fatness were rated by 11 personnel, male and female, officer
and enlisted, and African-American and Caucasian. Fatness was rated on a scale from 1 (very thin)
to 7 (obese) by viewing swimsuit photographs. Military appearance, using Class A uniform and
swimsuit photographs, was rated on a scale of 1 (poor) to 5 (excellent). Raters were asked to use
their own personal standards to assess “Military appearance.” Hodgdon et al. (1990) found that
while fatness ratings could be considered valid and reliable, ratings of appearance did not fare
as well. Ratings of appearance in uniform were not highly correlated with percent body fat (0.53
for males and 0.46 for females). The authors conclude that factors other than body composition,
such as subjective judgment, may influence ratings and that “. . . it is not feasible to establish
a single rating procedure which can be used to rate both Military appearance and fatness” (p. 22).

Figure 2.5 Notional depiction of performance-based physical fitness standards (Tier II)

Moreover, appearance and readiness concerns may even be incompatible when maximum job
performance is desired (Institute of Medicine, 1998). The U.S. Navy, for example, carried out
tests that investigated the relationship between their physical readiness test and performance of
materials handling tasks. Materials handling appears to be the most physically taxing task for
U.S. Navy personnel. When the results on these fitness tests and two of the occupational handling
tasks (box lift and box carry) were correlated with body composition measures, the correlations
were generally weak. Not surprisingly, one body composition measure (fat-free mass) was
significantly correlated with the box lift (Hodgdon, 1990). In other words, the more absolute
muscle mass one has, the more weight one can generally lift. So we may have these weak
relationships: body composition and visual appearance, body composition and physical
performance, and appearance and performance. In fact, it has been suggested that heavier women
(with high BMIs and more absolute muscle) tend to do much better on many Military physical tasks.
The conundrum, then, is that many heavier women may be better able to accomplish strenuous
Military tasks but struggle to meet appearance and/or body composition standards.


Summary 


Developing  fitness  standards  normally  involves  determining  some  minimum  level  of  individual 
physical  capacities  that  ideally  relate  well  to  occupational  job  performance.  For  candidate  selection 
or  worker  retention,  the  methodologies  must  be  reasonable  and  legally  supported  as  mandated  by  the 
Americans  with  Disabilities  Act  (Equal  Employment  Opportunity  Commission,  1991).  However, 
currently  there  are  many  workplace  scenarios  in  which  the  physical  requirements  of  the  job  are  often 
quite  low.  Still,  there  may  be  significant  interest  in  achieving  or  maintaining  a  reasonable  level  of 
worker  fitness  for  a  variety  of  reasons — general  health,  productivity,  physical  and  mental  readiness, 
and so forth. This, of course, raises the question of what specific level of fitness or physical
training is desired and how it is measured. One of the scientifically supported avenues for
maintaining or improving health is physical activity or training (U.S. Health and Human Services,
1996). Thus, rather than initially attempting to require a certain level of fitness (or training)
related directly to occupational performance, we have proposed an alternative, or baseline,
physical fitness requirement for selected applications (Military or general population): that of
health-related fitness. This would be a basic level of fitness for overall health, quality of
life, and likely increased levels of physical performance, including both occupational and
recreational activities.

Physical activity, health, and fitness are strongly and positively related. These parameters
also tend to reside on a continuum of effects with a distinct degree of interindividual
variability. Although it is generally agreed that the degree of improvement in general health
status is often closely tied to the magnitude of the improvement in fitness or physical activity,
it is becoming more and more apparent that these relationships are not simple (Haskell, 1994)
but, indeed, rather complex (Bouchard & Shephard, 1993). More general physiologic adaptations
have been well documented in the literature. In fact, however, the adaptive responses to physical
training (or exercise) are very complex and normally include peripheral, central, structural, and
functional factors (Pollock et al., 1998). Moreover, there are insufficient comparable data on
the specific intensity, frequency, and volume of training to fully quantify the specific benefit
outcomes (Pollock et al., 1991).

The studies on specific morbidity outcomes and physical activity or fitness are voluminous. As
discussed earlier, several major reviews of these topics provide a wealth of scientific insight
here, including Physical Activity and Health: A Report of the Surgeon General (1996), and the
Physical Activity, Fitness and Health Consensus Statement (Bouchard, Shephard & Stephens, 1993).
More specifically, other studies by leading experts have explored the effects of physical
activity on a number of chronic conditions and diseases. For example, these afflictions include
obesity (DePietro, 1995; Stefanick, 1993), hypertension (Hagberg, 1989), cardiovascular disease
(CVD) (Farrell et al., 1998; Lee, Blair, & Jackson, 1999; Stofan et al., 1998), and diabetes
(Hu et al., 1999). Other studies have investigated the possible benefits of physical exercise on
osteoporosis (Recker et al., 1992), cancer (Sesso et al., 1998), clinical depression, stroke
(Kiely et al., 1994), and musculoskeletal health (e.g., back pain and other injuries). Moreover,
physical activity has been shown to have salutary effects on more than one condition at a time.

Besides the direct benefits of improved personal health and therefore decreased health costs,
many in the corporate environment suggest that health promotion in the workplace is ultimately
cost effective. Overall, a number of studies tend to suggest potentially beneficial associations
between worker health, general wellness, and productivity. Although these relationships are
statistically significant, the true strength of the associations may be generally low (Bouchard,
Shephard, & Stephens, 1993). Therefore, some uncertainty still exists regarding the expected
cost-benefit ratio when instituting an employee fitness program. However, even given the paucity
of more rigorous research in the area, improved health as an outcome of increased physical
activity has its own inherent return, regardless of whether a more favorable cost-benefit ratio
can be demonstrated in strict monetary terms (Bouchard, Shephard, & Stephens, 1993).

We  note  that  risks  are  also  associated  with  engaging  in  regular  physical  activity.  Although  the 
most  severe  of  these  risks  may  be  an  increased  risk  of  heart  attack,  certainly  the  most  prevalent  and 
critical  risk  of  starting  an  exercise  program  would  be  to  not  continue.  The  scientific  consensus  is 
very  clear:  the  greatest  risk  is  not  engaging  in  any  physical  activity  at  all.  People  at  all  fitness  levels 
can  safely  implement  the  current  activity  recommendations  for  good  health,  tailoring  a  program  to 
specific  needs  if  necessary.  If  a  person  takes  the  appropriate  precautions  and  consistently  engages 
in  physical  activity  throughout  life,  the  quality-of-life  outcomes  can  be  enormous. 

The recommendations for exercise prescription from a number of well-recognized sources were
reviewed. A more rigorous history and review of physical activity recommendations has been
accomplished in the Surgeon General’s report on Physical Activity and Health. Generally, it
suggests that people can increase their physical activity in many ways. It is acknowledged that
both the more structured (ACSM Position Stand) and lifestyle (CDC/ACSM) approaches can work for
the relatively sedentary person. Significant points of agreement across the board are that most
persons should accumulate at least 30 minutes of moderate aerobic activity on at least three, if
not most, days of the week; additional health and functional benefits come with increased
exercise intensity and/or volume; and resistance training programs should be accomplished at
least twice a week, involving 8 to 12 major muscle group exercises and repetitions each (multiple
sets are not required). It was further noted that some may still view this guidance as too
liberal. This seems especially true with regard to the issue of intermittently performed exercise
throughout the day.
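
As a rough illustration, the points of agreement listed above can be written as a weekly check. This is a minimal sketch under stated assumptions: the function name, the seven-entry daily log format, and the reading of "at least three days" as the minimum are choices made for the example, not part of any cited guideline document.

```python
def meets_activity_guidance(aerobic_minutes_by_day, resistance_sessions,
                            min_minutes=30, min_days=3, min_resistance=2):
    """Check one week of activity against the points of agreement above.

    aerobic_minutes_by_day: seven numbers, the accumulated minutes of
        moderate aerobic activity on each day of the week.
    resistance_sessions: number of resistance-training sessions that week.
    """
    if len(aerobic_minutes_by_day) != 7:
        raise ValueError("expected one entry per day of the week")
    # Count the days on which the accumulated minutes reach the minimum.
    active_days = sum(1 for m in aerobic_minutes_by_day if m >= min_minutes)
    return active_days >= min_days and resistance_sessions >= min_resistance


# Example week: three qualifying aerobic days and two resistance sessions.
week = [30, 0, 45, 0, 30, 0, 0]
print(meets_activity_guidance(week, resistance_sessions=2))
```

A check of this kind records the dose of activity performed; it says nothing about the fitness outcome achieved.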

The issue of (exercise) dose and (physical) response might be viewed as the description of how
much and what type of physical activity are required to achieve specific health-related or
performance outcomes. Obviously, there is some disagreement among groups as to the specific
amount of physical activity required for “good health.” The specific dose parameters (volume,
intensity, modality) associated with improved health-related responses may be quite different
from those typically identified when the goal is physical performance improvement. In fact, with
the exception of outcomes for reduced adiposity or insulin resistance, very little is known about
the degree of health improvement (or response) to a specific dose of activity (Haskell, 1994).

This issue was approached from a more qualitative standpoint: optimal versus adequate versus
minimal doses. An adequate exercise dosage, which is described as the level at or above the
threshold for the desired physiological change, may be more easily identified in the literature.
However, very little is currently known about the optimal or minimal dosages of exercise needed
for the desired effects. Fortunately, the available knowledge is still sufficient to ensure that
exercise prescription settings are adequately directed at improving health-related outcomes
(Haskell, 1994).

Clearly, one of the most challenging issues is correlating the exercise prescription with the
expected test outcome or metric. It has been suggested that by subscribing to the ACSM
guidelines, a consistently ready, fit, and healthy force would be maintained (Institute of
Medicine, 1998). Although this may appear teleologically acceptable, it generally falls short as
a testable, fitness-related standard. Furthermore, it is primarily a quantification of the dose
and not a metric for a measurable outcome. Similarly, just passing a medical examination, that
is, presenting with no physiological maladies, does not ensure compliance with a program of
regular physical activity. In essence, passing a medical screen is the first (and too often the
only) selection or retention standard to qualify for a job or career entrance. On the other hand,
the identification of a minimal prescription dose is an elusive and untestable goal.

The limited number of past attempts to “methodologically” identify specific fitness test
standards or metrics were reviewed. At least for the aerobic component of fitness, the findings
were really more qualitative in nature; that is, the best evidence for a cut-point would be in
the low or fair percentiles from a normative population scatter. On the other hand, BMI values
of 25 and 27 (wt/ht²) describe much more specifically the increased health risks associated with
overweight and obesity, respectively. Little if any work has been done to elucidate health
metrics for muscular strength and muscular endurance. This may well prove the most difficult
fitness modality to address from a health-based perspective.
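
For reference, the BMI calculation mentioned above can be sketched as follows. The units (kilograms and meters) are the conventional ones and are assumed here, since the text gives only wt/ht²; the cut-points of 25 and 27 are the values quoted in the preceding paragraph, and the function names and category labels are illustrative.

```python
def bmi(weight_kg, height_m):
    """Body mass index: weight divided by height squared (kg/m^2)."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


def bmi_category(value, overweight_cut=25.0, obesity_cut=27.0):
    """Classify a BMI value using the cut-points quoted in the text."""
    if value >= obesity_cut:
        return "obesity"
    if value >= overweight_cut:
        return "overweight"
    return "below overweight cut-point"


# Example with made-up measurements: 70 kg at 1.75 m.
value = bmi(70.0, 1.75)
print(f"BMI = {value:.1f} ({bmi_category(value)})")
```

Because the calculation uses total body weight, it cannot distinguish muscle from fat, which is one reason a person with more absolute muscle mass may exceed a BMI cut-point.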

The general application of a health-based fitness standard may lend itself better to specific
Military applications. A comparison of Military readiness and appearance issues with regard to
health and fitness was also accomplished. The Military Services need to be a fit and ready force
so that:

1. All personnel will be able to perform well under deployment or emergency conditions,

2. Everyday jobs can be undertaken safely and efficiently, and

3. Costs due to absenteeism and medical problems will be minimized.

It was concluded that a health-related fitness approach is consistent with general Military
readiness considerations. Moreover, factors other than body composition, such as subjective
judgment of physical appearance, are not appropriate metrics. That is, it is not scientifically
feasible to establish a single rating procedure to rate both Military appearance and body fatness
(Palmer et al., 2000).

We  also  point  out  that  the  Military  environment  may  present  some  additional  challenges  as  well 
as  unique  opportunities  in  the  physical  skills  arena.  Although  the  Military  recognizes  the  impor¬ 
tance  of  adequate  physical  fitness  for  its  members,  it  could  be  argued  that  a  generic  test  to  ensure 
full  Military  readiness  and  mission  success  is  not  currently  practical  or  even  feasible.  Therefore,  the 
U.S.  Air  Force  for  one  is  exploring  a  two-tier  approach  to  establishing  physical  fitness  standards  for 
its  members.  The  objective  of  Tier  I  would  be  primarily  health-based/general  readiness  fitness,  with 
programs  and  standards  that  apply  to  all  U.S.  Air  Force  personnel.  Tier  II  would  focus  on  an  occu¬ 
pation-specific,  performance-based  fitness  program  that  will  further  enhance  mission  readiness  and 
accomplishment.  Clearly,  an  inability  to  meet  these  latter  standards  implies  an  increased  risk  of 
mission-specific failure, greater physical fatigue, and increased risk of injury for these career fields.

Finally, major issues have been raised in this SOAR regarding job-specific physical performance
tests. Difficulties in instituting job-specific performance tests in the Military have been
described not only in this report but also elsewhere (Institute of Medicine, 1998). Examples are
the potentially large number of tests the Military Services would be required to devise and
administer, and the frequency with which people are assigned to new occupations and/or promoted
(Institute of Medicine, 1998), along with the potential lack of test sensitivity and specificity.
As a general theme throughout this SOAR, developing validated occupational performance tests may
be considered a complex process in and of itself. The only feasible approach to this for the
Military would be to “group” jobs into testing categories, which has its trade-offs. Further,
trade-offs between expected cost saving, available resources, and legal (ADA) issues are seen in
industrial environments. Moreover, performance on Military fitness tests does not correlate well
with performance on task-specific job tests or even strength tests required for those Military
vocations that demand at least moderately heavy lifting or carrying capacities (Institute of
Medicine, 1998).

Organizations have sought to establish standards for job candidate assignment and worker
retention to ensure job performance and safety. Incorporating standards for minimal performance
on tests of physical capacity should be a scientifically defensible, sometimes lofty, goal. As an
underlying theme in this text, the scientific process to establish defensible standards may be
considered complex and varied. Therefore, it may not always be possible to achieve the desired
degree of test validity. Rather, one must work with what is at least minimally acceptable or
defensible. Furthermore, cost-benefit concerns and resource considerations may be overly
constraining to the idealized outcomes. This chapter has therefore attempted to build a rationale
for an ancillary approach to fitness standards development: health-based fitness levels. The
scientific literature is now replete with support for the strong association between physical
activity/fitness and general health, wellness, and quality of life. Identifying the minimal
dose-response relationships, not to mention truly testable metrics or standards, has proved the
greatest challenge. An overview of the limited attempts thus far applied for this type of
approach to the process of physical fitness standards development has been presented. However
attractive or meritorious this endeavor may seem, the basic observation should be that in
application or practice varying levels of difficulty may be encountered, depending on the chosen
modality. This possibility is primarily due to the lack of sufficient data and/or discrete
methodologies to identify clearly defined cut-points on which to base standards. Nevertheless,
this should not deter further efforts to investigate or apply alternative procedures.

Endnote 

1.  Readiness  for  all  U.S.  Military  members  is  integral  to  the  basic  Military  requirement  of  maintaining  a 
state  of  readiness  to  go  to  war.  In  other  words,  it  is  expected  that  all  Military  members  will  be  prepared 
to  support  the  combat  mission  as  needed,  regardless  of  the  lack  of  rigor  associated  with  their  primary 
Military occupations.

Abstract


This chapter describes the process of conducting a job analysis or Physical Demands Analysis,
the purpose of which is to describe and quantify those aspects of physical fitness or physical
performance that are relevant to job performance. Given the interdependencies between aspects of
the job, the environment, and the employee, adopting a systems approach is essential in
documenting and quantifying these elements. Conducting a job analysis should be an employer’s
first step to improve the integration of the human element into the system.

A  number  of  techniques  have  been  presented  in  this  chapter  to  identify  the  most  physically 
demanding  tasks  using  some  industrial/organizational  psychology  tools,  and  to  quantify  the  stress 
and  strain  associated  with  these  tasks  using  physiological,  biomechanical,  and  psychophysical 
approaches. The strengths and limitations of the various approaches are discussed.

The  issue  of  which  approaches  and  techniques  should  be  selected  by  the  investigator  will  depend 
on  many  factors,  including  the  nature  of  the  job  or  task  under  investigation,  the  extent  of  financial 
and  human  resources  available  to  support  the  work,  and  the  expertise  of  the  investigation  team.  In 
general terms, a multidisciplinary approach performed by a team of mixed gender and age with
differing skills and perspectives is preferred, as it is more likely to elicit a complete and
balanced output.

Conducting a job analysis is a complex process that requires a considerable investment of time,
money, and effort. Good science and good judgment are required in equal measure. The output
should provide a sound foundation for establishing occupational fitness standards, focusing
physical training programs, identifying health and safety issues, and prioritizing those tasks
that require job redesign. The long-term benefit to the employer of implementing these strategies
will be increased productivity through improved operational effectiveness and reduced injury.

Introduction


What  Is  a  Job  Analysis? 

Given the focus of this State-of-the-Art Report (SOAR) on occupational fitness, this chapter
concentrates on the physical demands of jobs rather than on other psychosocial aspects, normally
described in other texts on job analysis. Physical Demands Analysis is a more appropriate term
for job analysis in this context. Most of the published sources that discuss job analysis do so
from an industrial or organizational psychology perspective, and although our focus differs,
these disciplines do provide some useful measurement tools. Most ergonomics and human factors
texts cover the subject, but again, a greater emphasis is usually placed on the psychosocial
perspectives and less attention is afforded to the physical elements. Perhaps this is due to the
lack of physically demanding occupations that remain in our increasingly automated and sedentary
society (refer to Chapter 1 on the History of the Standards Development Process for more
discussion on this topic). The Military, the emergency Services, and a relatively small number of
jobs in the profit sector are the principal sources of physically demanding occupations.

A job analysis involves systematically collecting information about a job in order to prepare a
job description. The process involves determining what tasks are included in the target job and
what job skills or other employee characteristics are required. Current subject-matter experts
can provide information on whether or not the tasks and skills are part of or required for the
job, and their frequency or occurrence or use on the job. (Dwyer, Prien & Burke, 1987)

Conducting a job analysis is a protracted and complex process, requiring an intriguing blend of
science and judgment. Although the scientific approaches described in this chapter can take us
only so far, our enthusiasm for pursuing scientific rigor should be tempered by common sense and
considered opinion. Achieving this balance between science and considered opinion is key to
successfully performing a job analysis.

The objective of a job or Physical Demands Analysis is to describe and quantify those aspects
of physical fitness or physical performance that are relevant to job performance. Given the close
association and interdependencies between aspects of the job (equipment used, duration and
frequency of tasks, etc.), the working environment (temperature, humidity, noise, etc.), and the
physical capability of the employee, a job analysis must encompass all parts of the system. How
realistic would it be to assess the physical demands of firefighting, for example, without
considering the protective clothing and breathing apparatus worn by firefighters and the thermal
load experienced during operations? It is the combined load on the human body that must be
considered, not just the strength requirement to drag a hose or lift a casualty.

The description and quantification of the physical stress imposed by the job and the resultant
strain imposed on the employee is a vital first step in establishing occupational fitness
standards, whether they are for selection, training, or retention. The output from a job analysis
provides a solid foundation from which to develop and validate job-related fitness standards. In
the longer term, understanding the physical demands of jobs and implementing appropriate fitness
standards should also serve to increase productivity through improved operational effectiveness
and reduced injury. Improving the match between the physical capability of the employee and the
physical demands of the job is the key. The output from a job analysis may also serve to
highlight training needs and to audit the effectiveness of current fitness training programs as
well as to expose health and safety issues and the need for job redesign. In short, conducting a
job analysis should be an employer’s first step to improve the integration of the human element
into the system.

There  is  a  wide  range  of  job  analysis  techniques  available,  some  of  which  are  appropriate  to  a 
Physical  Demands  Analysis.  In  this  chapter,  we  attempt  to  encapsulate  the  key  elements  of  a 
Physical  Demands  Analysis.  However,  our  preferred  multidisciplinary  systems  approach  makes  it 
difficult  to  contain  within  one  section  without  omitting  significant  topics.  We  have  attempted  to 
find  a  reasonable  compromise.  Following  an  introduction,  an  overview  of  some  of  the  more  useful 
techniques  is  provided,  first,  for  identifying  the  most  physically  demanding  tasks  using  some  indus¬ 
trial/organizational  psychology  tools,  and  second,  for  quantifying  the  stress  and  strain  associated 
with  these  tasks  using  various  techniques  that  we  have  classified  under  the  titles  of  physiological, 
biomechanical,  and  psychophysical  approaches. 

Who Should Conduct a Job Analysis?

A job analysis is often best conducted by an interdisciplinary team comprising in-house workers
and supervisors who are highly familiar with the jobs under scrutiny, and external consultants
who can bring greater objectivity and reliability to the process. It is useful to have a
multiethnic and multigender team because different perspectives on job and task performance are
often valuable.

Historically, defining the requirements of the most physically demanding occupations has involved
investigating the performances of men, since it is men who hold, or at least held, most of these
positions. However, this approach may be problematic because men and women perform tasks differently
(Courville, Vezina & Messing, 1991; Stevenson et al., 1990). Further, occupations and equipment have
usually been designed by men for men, thus creating a systemic bias against women in equipment,
payloads, and work organization (Courville et al., 1991). These potentially biasing factors should be borne
in mind when conducting a job analysis, and efforts should be made to counter them.

Which  Employees  Should  be  Evaluated? 


“To calculate sample size for a Physical Demands Analysis we need to know the statistical
properties of the process generating the data, the width of the confidence interval, and the probability of
the interval containing the true value” (Wilson & Corlett, 1995).

In reality, this information is usually unknown, and selecting a suitable sample of employees to
investigate will always be a compromise between recruiting the numbers and variety of personnel
to fulfill criteria for a valid statistical analysis and recruiting personnel who are available and
willing to take part. The extent of resources, both human and financial, will also inevitably and
unfortunately impinge upon the sample size.
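
When the three quantities named in the Wilson and Corlett quotation are known, the calculation itself is short. The following is a minimal sketch, assuming that the measurement of interest is approximately normally distributed with a known standard deviation and that a two-sided confidence interval for a mean is wanted. The function name and its parameters are illustrative and do not come from the source.

```python
import math
from statistics import NormalDist


def sample_size_for_mean(std_dev, half_width, confidence=0.95):
    """Smallest n for which a two-sided confidence interval for a mean
    has the requested half-width, assuming normal data with known std_dev."""
    if std_dev <= 0 or half_width <= 0:
        raise ValueError("std_dev and half_width must be positive")
    if not 0 < confidence < 1:
        raise ValueError("confidence must lie strictly between 0 and 1")
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    # n = (z * sigma / E)^2, rounded up to a whole participant.
    return math.ceil((z * std_dev / half_width) ** 2)
```

Under these assumptions, halving the acceptable half-width roughly quadruples the required sample, which illustrates why resources so often constrain the precision that can be achieved.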

In simple terms, the larger the sample, the more precise the resulting estimates will be.
Although there is one argument to select “normal” employees and “normal” work scenarios, there is
another to opt for extremes of the employee population. If we select “normals” we may completely
miss some of the minority groups in the work force and some of the more unusual, less frequent
tasks that might contain greater physical demands. The preferred approach will certainly involve
adequate representation from key minority groups (usually women and ethnic minorities), though
absolute target numbers will depend on the approach adopted and the resources available. Further
details about sampling (such as whether it should be random, stratified, or clustered) may be found
in more specialized texts (e.g., Sinclair, 1975; Ferguson & Takane, 1989).

Data on employees should be recorded anonymously to prevent the matching of particular data
to particular people. This can usually be achieved fairly easily with paper records by allocating
employees a unique identifying number and using the number only on paper and computer records. However,
it may not be as straightforward with video and photographic records. As a minimum, all participants
should be briefed on the objectives of the investigation and how their data will be protected.
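
The allocation of unique identifying numbers can be sketched as follows. This is an illustration only, assuming that a list of employee names is available and that the resulting key is stored separately from the study data; the function name and the six-digit format are hypothetical choices, not part of the source.

```python
import secrets


def allocate_study_ids(employee_names, n_digits=6):
    """Map each employee name to a unique random numeric identifier.

    The returned key should be kept apart from the study records so that
    data cannot be matched to a person without it.
    """
    if len(set(employee_names)) > 10 ** n_digits:
        raise ValueError("not enough distinct identifiers for this many names")
    key = {}
    used = set()
    for name in employee_names:
        if name in key:
            continue  # the same person keeps a single identifier
        # Draw random identifiers until an unused one is found.
        while True:
            candidate = f"{secrets.randbelow(10 ** n_digits):0{n_digits}d}"
            if candidate not in used:
                break
        used.add(candidate)
        key[name] = candidate
    return key
```

Random rather than sequential numbers are used in this sketch so that an identifier reveals nothing about the order in which employees were recruited.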


Identifying  the  Physically  Demanding  Tasks  and  Elements 


There  are  two  broad  approaches  to  conducting  a  Physical  Demands  Analysis.  The  first  is  to 
focus  on  the  work  itself,  describing  the  purpose  of  the  job  and  the  equipment  used,  and  the  second 
is  to  focus  on  the  employee,  describing  the  physical  and  behavioral  requirements  of  the  job. 
Similarly,  there  are  many  techniques  that  can  be  employed  to  elicit  information,  organize  it,  and 
deploy it to make decisions about the physical job demands. These include observation, interview,
and questionnaire, as well as the investigation of other written material that may be available.

In  reality,  it  is  good  practice  to  use  a  combination  of  techniques,  since  this  should  provide  a 
more  holistic  and  valid  outcome.  The  selection  of  techniques  will  depend  on  a  number  of  factors, 
including the nature of the job under investigation, the employee's ability and willingness to
understand and tolerate the different techniques, the experience and the preferences of the user, the
physical environment in which the analysis must be conducted, and, inevitably, the resources available.
Some  techniques  require  verbal  or  written  input  from  employees,  others  are  observation  based  and 
require  minimum  input  from,  or  interruption  to,  the  work  force. 

General Information Gathering

Before implementing these measurement techniques, it is advisable to carry out an initial
familiarization and documentation phase in which the person conducting the Physical Demands Analysis
becomes familiar with the job and gathers relevant documentation, since this will steer subsequent stages
of the process. The objective of this preliminary phase is to establish what the job entails, its objectives, and
its task elements. A relatively small effort on this phase may avoid much wasted effort later.

Materials such as a job description, written procedures, training manuals, work rosters, shift
schedules, and relevant reports covering any problems, accidents, or legal cases that concern the job
or task under scrutiny should be collated. These and other sources of relevant information may be
available from the Human Resource or Occupational Health Services, if they exist. In particular, any
records on sickness absence and musculoskeletal injury associated with employees in the job
categories of interest may provide useful insights into problem areas.

Although these sources of information are too superficial to use alone, they can guide more
detailed investigation. They also help to ensure that all of the critical aspects of the job (i.e., a
proper representation of the job) will be covered. Employees at different levels within the organization,
including job incumbents, supervisors, union representatives, and managers, should conduct checks
for completeness and relevance of all work tasks.

Task,  Environmental,  and  Human  Factors 

An employee’s ability to fulfill the requirements of a job depends upon three interrelated
components: the task itself, the environment in which the task is performed, and the capability of the
worker. Each of these three components has a number of elements. It is these elements that must
be documented and quantified in a job analysis using the various techniques and approaches that
are described in subsequent sections. Describing these components and elements in detail is
beyond the scope of this chapter, but an outline of these topics is provided below. For a more
detailed discussion the reader is referred to the textbook and article by Ayoub & Mital (1989) and
McDaniel (1998), respectively.

Task elements: The task elements describe the mode, frequency, intensity, and duration of the
task, the postures, any objects involved in the task, and any equipment that is used.

The mode or type of activity is important as some activities are more readily sustainable than
others, or involve greater or lesser stress or strain. The main source of variation is probably the
amount of muscle mass involved in different activities and whether the effort involves static or
dynamic muscle activity. For example, higher levels of work are sustainable while cycling compared
with lifting (Petrofsky & Lind, 1978a) and while running compared with loaded marching
(Rayson, Bell, Davies, & Rhodes-James, 1995). The greater the frequency and intensity of a task,
the greater is the stress of that task and the greater is the resultant strain on the employee. Thus
documenting the mode, frequency, duration, intensity, and rest periods is a critical aspect of
quantifying the demands of the job.

Posture refers to the position the body adopts to initiate an activity. The same activity can often be
performed in different postures, e.g., lifting in a stoop (straight legs), squat (straight back) or free-style
(semi-squat). The squat posture is biomechanically least stressful in lifting tasks, but the stoop posture leads
to lower energy expenditure. The free-style posture is considered least stressful or least tiring.

A variety of factors associated with any object that is handled can affect task performance and hence
these factors should be documented. These include the dimensions, symmetry, and the presence or
absence of handles (Ayoub and Mital, 1989). The dimensions of objects have an effect on handling
capacity, energy expenditure, and spinal stresses. Handling loads with handles is safer and less
stressful and may allow employees to handle approximately 15% more load (Snook & Ciriello, 1991).

The ability to handle load is also a function of the vertical height and distance of lift (Snook,
1978a and b). Lifting capability decreases with vertical height above the ground. The decrease in
capacity may be as large as 23% when the starting point of the lift is changed from knuckle to
shoulder height, for example. The height of lift also affects the stress on the spine. Lifting from the ground
is more stressful than lifting from knee or hip height. Asymmetrical loads are more stressful to lift
(intradiscal pressures and shear forces increase), which reduces lifting capability (Mital and Fard, 1986).

Environmental elements: The environmental elements describe the ambient temperature,
humidity, altitude, noise, air pollution, work space, clothing, etc. Core temperature is maintained in humans
over a relatively small range, between approximately 36 and 40°C. Outside of this range, thermal
regulation is impaired or even lost. During activity, heat production is greatly increased, and if this heat
is not dissipated, core temperature increases rapidly to intolerable levels. When protective clothing
is worn, or in hot, humid climates, heat loss mechanisms are impaired (Kolka, 1992).

Even under a relatively modest heat load, heart rate, core temperature, and sweat rates increase,
and work output declines. For example, in one study where the temperature was increased from 17
to 27°C, lifting capacity declined by 20%, pushing capability by 16%, and carrying capability by 11%
(Snook & Ciriello, 1974).

The effects of restricted working space on posture, movement, and working capacity have been
documented both in terms of increased physiological and biomechanical strain and perceived
discomfort ratings (Mital, 1986). Subjects experienced a 13% decline in carrying capabilities when loads
had to be carried through a 56 cm wide passage, for example. Often both the body and load have
to be re-oriented, resulting in slower, more cautious movement. Limited headroom also reduces
lifting capacity (Ridd, 1985).

Human elements: The human elements describe the somatic factors, such as gender, age, and body
dimensions; psychic factors, such as attitude and motivation; sensory factors, such as perception,
integration, and transmission of information; and health and physical training state. Of particular interest to
us, due to our concern with occupational fitness, are the somatic and the health and fitness factors. These
factors, although not strictly relevant to a job analysis (they do not define the job itself), are of
relevance to a Physical Demands Analysis as they impact on the employee’s ability to perform the job.

Measures of body size and composition are well documented as predictors of military task
performance, including lifting capability (Nottrodt & Celentano, 1987; Rayson, Holliman, & Belyavin,
2000), carrying capability (Rice & Sharp, 1994), and loaded marching (Frykman & Harman, 1985;
Rayson, Holliman, & Belyavin, 2000). In general terms, larger employees with greater muscle mass
and less body fat perform physically demanding tasks more effectively, though the exact relationship
between these measures and performance varies according to the details of the task.

Fitness scores, especially on muscle strength, muscle endurance, muscle power, and aerobic
capacity, have long been shown to be positively associated with job performance (e.g., Sharp et al.,
1980; Rayson, Holliman, & Belyavin, 2000) and negatively associated with risk of injury (Chaffin,
1974; Jones, Bovee, & Knapik, 1992; Harwood, Rayson, & Nevill, 1999). Indeed, it is intuitive to
expect that strong and aerobically fit employees are more likely to perform the job effectively, have
a greater reserve capacity, and hence are less susceptible to injury.

The influence of age and gender on job performance is relevant insofar as women and older
employees typically have a lower physical capability and therefore a lower work tolerance and less
reserve capacity than their male and younger contemporaries. However, recent implementation of
so-called gender-free job-related physical selection standards by the Armed Forces in some
countries, such as the United Kingdom (Rayson, Pynn, Rothwell, & Nevill, 2000), has reduced the
relevance of age and gender of employees. In theory, job-related selection standards will ensure that
all employees have the required physical capability to meet the requirements of the job, and hence
age and gender carry less significance.


Observation 

Careful observation of work performance can provide data on the occurrence, frequency, and
duration of specific activities. This technique is most useful in a Physical Demands Analysis, as opposed
to other aspects of job demands (e.g., cognitive and social), as it depends on visual activities, and the
physical components are normally readily visible. Some, such as static efforts, may not be, however.

“All observational methods have the deceptive appearance of simplicity, giving the potential user the
impression that their use is easy and their results simple to determine and conclusive.
Unfortunately, this is not so, and potential users should be aware of the need for training in the
method, monitoring of its use and supporting knowledge for the effective application of its results”
(Wilson & Corlett, 1995).

Observation  is  ideally  suited  to  jobs  or  tasks  that  involve  short  and  repetitive  work  cycles.  In 
these  types  of  activities,  several  hours  of  observation  may  suffice  to  capture  the  entire  extent  of  the 
physical  requirements.  By  contrast,  jobs  that  are  not  structured  or  repetitive  may  require  extended 
periods  of  observation  before  a  single  relevant  incident  is  observed.  Similarly,  tasks  with  variable 
physical demands may not be best suited to this technique. A law enforcement officer making an
arrest provides an example of such a task. It may require several days of observation before a single
incident is captured, and the variability in the manner in which this task is carried out
may require many repeated exposures before a “typical” modus operandi can be established.

If  the  task  is  conducted  quickly,  or  involves  complex  or  highly  skilled  movements,  it  may  not  be 
possible  for  the  observer  to  keep  up  with  events,  and  direct  observation  may  not  be  appropriate. 
The  use  of  video  recording  and  subsequent  data  analysis  may  overcome  this  problem.  Videotaping 
may  also  overcome  another  potential  weakness  of  this  technique  —  the  presence  of  an  observer 
interfering  with  the  normal  pattern  of  behavior.  Often,  observation  is  best  deployed  alongside  other 
information-gathering  techniques  such  as  interview  or  questionnaire. 

Activity sampling is a time-structured observational approach in which snapshots of activity are
coded at predetermined time intervals, which might range from 10 seconds to 1 hour. The
objective of activity sampling is to quantify the proportion of time spent performing different activities,
often with a view to focusing further investigation on the more frequent tasks. In preparing to use
this approach, four issues need to be considered: the classification of
activities, the development of a sampling schedule, information collection and recording, and the
actual analysis of activity samples. These are described in detail by Kirwan and Ainsworth (1992)
on pages 42 to 44, but a brief description of each of the four issues follows.

To ensure that all of the useful activities are recorded, observers need to familiarize themselves
with  the  activity  and  ideally  pilot  the  data  collection  procedure,  before  data  collection  commences 
in  earnest.  Up  to  20  discrete  activities  can  be  coded  unless  video  is  used,  in  which  case  any  number 
can  be  managed.  The  activities  targeted  must  be  clearly  distinguishable  and  the  limits  identified 
(i.e.,  the  start  and  end). 

The sampling interval can be calculated according to the Nyquist criterion: the interval
between samples must be less than or equal to half the duration of the shortest activity that is being
coded. For example, if a gunner in a tank must load and fire a shell every 1 minute, replenish shell
supplies every 2 hours, and change/fill the fuel tanks every 8 hours, a sampling interval of 30
seconds or less would be required over a number of days. Either a fixed or a random sampling
interval can be adopted. Fixed is normally preferred unless the tasks are of very long duration, or unless
the repetition rate is very high, in which case fixed sampling can lead to a systematic bias. Sampling
is normally continued for sufficient duration to sample the full range of activities and should yield
approximately 1,000 sampling points for each session.
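
The arithmetic behind the gunner example can be sketched as follows. This is a minimal illustration, assuming (as the example in the text appears to) that the 1-minute load-and-fire cycle is treated as the shortest activity; the function names are hypothetical.

```python
def max_sampling_interval(activity_durations_s):
    """Longest admissible interval between samples, in seconds:
    half the duration of the shortest activity being coded."""
    if not activity_durations_s:
        raise ValueError("at least one activity duration is required")
    shortest = min(activity_durations_s)
    if shortest <= 0:
        raise ValueError("activity durations must be positive")
    return shortest / 2


def observation_time_s(n_points, interval_s):
    """Observation time, in seconds, needed to collect n_points samples
    at a fixed sampling interval."""
    return n_points * interval_s


# Tank gunner example from the text: 1 minute, 2 hours, and 8 hours.
gunner_activities_s = [60, 2 * 3600, 8 * 3600]
interval_s = max_sampling_interval(gunner_activities_s)
```

With the 30-second interval that results, collecting the suggested 1,000 sampling points corresponds to `observation_time_s(1000, interval_s)` seconds of observation, a little over 8 hours per session.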

Before the measurement phase commences, the employee should be briefed and asked to
perform the work as normal. The level of detail recorded can vary from a simple tally in which the
occurrences  of  a  particular  activity  are  recorded  (but  only  frequency  will  be  elicited)  to  recording 
the  activities  sequentially,  in  which  case  both  frequency  and  duration  will  be  obtained. 
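
The difference between a simple tally and a sequential record can be illustrated with a short sketch. It assumes that activity codes were logged in order at a fixed sampling interval; the function name, the dictionary keys, and the notion of a "bout" (a run of consecutive identical codes) are illustrative choices rather than terms from the source.

```python
from collections import Counter
from itertools import groupby


def summarize_activity_samples(codes, interval_s):
    """Summarize activity codes recorded sequentially at a fixed interval.

    For each activity, returns the number of samples, the proportion of
    observed time, the estimated time in seconds, and the number of
    separate bouts (runs of consecutive identical codes).
    """
    if not codes:
        raise ValueError("at least one sample is required")
    samples = Counter(codes)
    # groupby collapses each run of identical consecutive codes into one bout.
    bouts = Counter(code for code, _ in groupby(codes))
    total = len(codes)
    return {
        code: {
            "samples": n,
            "proportion": n / total,
            "estimated_s": n * interval_s,
            "bouts": bouts[code],
        }
        for code, n in samples.items()
    }
```

A tally alone would yield only the sample counts; the bout counts and estimated times depend on the order of the codes being preserved, which is why sequential recording is needed to obtain both frequency and duration.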

Kirwan and Ainsworth emphasize the need for observers to devote sufficient time to ensure they
are familiar with the task under investigation and to check that the task categories are clear,
exhaustive, and mutually exclusive. Carrying out a small pilot study will highlight any unforeseen problems.
A category for “other activities” should provide a useful catch-all, at least in the pilot study. More
categories may need to be defined if uncoded activities keep recurring. A programmable beeper should
be used to prompt data collection. If video is used, a time stamp can be recorded on the image.

Recently,  we  successfully  used  handheld  computers  in  the  field  to  log  information  as  it  occurs 
(unpublished  data).  Activity,  posture,  terrain,  and  events  were  logged  directly  onto  the  computer 
using drop-down menus arranged in a hierarchical manner. Additional information can be
recorded in free text either directly onto the scribble pad on the small screen or orally on minitape or disk
recorders.  At  the  end  of  the  recording  session  the  data  on  the  palmtops  are  downloaded  into  an 
Access  database  for  later  merging  and  analysis. 

Observation  is  a  useful  means  of  identifying  the  sequence  of  activities  involved  in  a  task  as  well 
as  documenting  in  an  objective  manner  the  frequency  and  duration  of  activities.  We  have  found  it 
useful to verify the accuracy of other forms of data: sometimes the way in which the job is
actually performed differs significantly from the description in the training manual or user instructions.
Observation can also uncover additional activities that had not been foreseen. Care must be taken
when analyzing the data to distinguish between frequency and criticality. Just because an activity is
frequent does not necessarily mean it is an important aspect of job performance. Similarly, a
critical task may only be performed once a month, but if the employee’s life or a colleague’s life depends
on  it,  the  fact  that  the  task  is  performed  infrequently  may  be  irrelevant. 

Questionnaire


The use of questionnaires can be a cost-effective technique to elicit information, though
self-completion questionnaires are only appropriate for personnel who can express themselves well on paper.
Questionnaires can be distributed personally or by post and can be completed at any suitable time
and location. They can be structured with precoded responses, or be “open,” allowing an employee
to write down a response in his or her own words. There are advantages and disadvantages with
both variants, but one major advantage of the precoded version is the ease of data collation and
analysis, as well as a lower employee burden. Questionnaires can also, of course, be administered by
an interviewer, overcoming some of the limitations with the self-completion approach.

Most of the questions in questionnaires hinge around two major issues: the relative importance
of the task to the job and how often the task occurs (criticality and frequency). Some of the better
known questionnaires used for job analysis, such as the Position Analysis Questionnaire (PAQ)
(McCormick, Jeanneret, & Mecham, 1972), the Occupation Analysis Inventory (Cunningham,
Boese, Neeb, & Pass, 1983), and the Work Profiling System (Saville & Holdsworth, 1989) are
large, time-consuming questionnaires that are not well suited to investigating physical behaviors.
They take too long to administer and elicit large amounts of information that are, by and large,
irrelevant to our needs.

However, some job-orientated techniques, such as Fine and Wiley’s (1971) Functional Job
Analysis (FJA) (see also Fine & Cronshaw, 1999) and Annett, Duncan, Stammers & Gray’s (1971)
Hierarchical Task Analysis (HTA) can be useful. FJA is a technique used to describe what employees
do in standardized language. The focus is on the tasks they perform (i.e., purposeful actions
organized over time to address an objective). HTA analyzes the tasks, dividing them into increasingly
specific subtasks in a hierarchical fashion. An alternative employee-orientated technique is Repertory
Grid Technique (Kelly, 1955). Its advantage over some of the other employee-orientated approaches
(e.g., PAQ) is that the employee is not limited in his or her response by prestructured categories.

Other techniques such as the Task Ability Scales (Fleishman & Quaintance, 1984), the
Threshold Traits Analysis (Lopez, 1986), and the Minnesota Job Requirements Questionnaire
(MJRQ) (Desmond & Weiss, 1973, 1975) can be useful during a Physical Demands Analysis.
Fleishman’s Task Ability Scales include psychomotor characteristics. Their relevance to successful
performance on specified tasks is rated using graphic 7-point scales. Three points on the scale are
anchored by examples of concrete action. Lopez’s tool measures the relative importance of 33
characteristics spanning 5 main attributes, including physical characteristics. Brief work-oriented
definitions are provided for each characteristic. The interviewee is asked the importance of the
characteristic as a job requirement and the weight it has for total work performance. The MJRQ is a short
questionnaire with 45 items, some of which encompass physical ability (e.g., “precise movement of
fingers in the handling of very small objects”). The employee has to indicate on a 7-point scale the
importance of the specified actions or activities. For further details consult the original papers, or
for an overview refer to Drenth, Thierry, & de Wolff (1998).

The checklist is an example of a precoded questionnaire in which the employee normally indicates
by circling or scoring whether a particular event does or does not occur. However, with such closed
measurement vehicles, adequate attention must be devoted beforehand to ensure that the correct
issues are being addressed, the right questions are being asked, and the right language is being used.

An example of this type of approach applied to a Physical Demands Analysis would be to
design and implement a task-specific inventory that includes work-task statements that clearly
define specific work tasks. The task list might be drawn up by the investigator and finalized by
observation and discussion with employees and supervisors. An example of task descriptions used
by Prof. Tony Jackson in an oil production facility is provided below. The task list is then
administered to a random sample of workers who rate each task in terms of its importance (criticality),
frequency, time spent, and sometimes difficulty. Difficulty can be assessed using the rating of
perceived exertion scale (Borg, 1985) or the physical effort scale (Fleishman, Gebhardt, & Hogan,
1984). A weighted index can then be calculated for each task on the basis of the relative priorities
attributed to importance, frequency, and time spent on those tasks that are classified as the most
difficult. Those job tasks that are identified as key tasks then form the basis of any further
initiatives, such as the development of selection or retention tests.
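
The text does not specify how the weighted index is formed, so the following is only one possible sketch. It assumes that each task has mean ratings for importance, frequency, time spent, and difficulty, that the index is a weighted sum of the first three, and that only tasks at or above a chosen difficulty threshold are retained. The function names, the weights, and the threshold are all hypothetical and would need to be agreed for a given study.

```python
def weighted_index(ratings, weights):
    """Weighted sum of a task's ratings, using the keys found in weights."""
    return sum(weight * ratings[name] for name, weight in weights.items())


def rank_key_tasks(task_ratings, weights, min_difficulty):
    """Rank the most difficult tasks by weighted index, highest first.

    task_ratings maps a task description to a dict of mean ratings that
    includes 'difficulty' plus every key named in weights.
    """
    scored = [
        (task, weighted_index(ratings, weights))
        for task, ratings in task_ratings.items()
        # Keep only tasks classified as difficult enough to be of interest.
        if ratings["difficulty"] >= min_difficulty
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

A weighted sum is used here only because it is simple to explain to raters and supervisors; other schemes, such as multiplying importance by frequency, would fit the description in the text equally well.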

An  alternative  to  inventories  and  checklists  is  activity  diaries.  Asking  employees  to  maintain 
activity diaries that document their activities during predefined periods (e.g., typically in 30- or
60-minute blocks) can provide useful information. However, the burden on the employee is quite high,
and  the  accuracy  and  usefulness  of  the  information  is  variable. 

Questionnaires are also available to gather data on the incidence, prevalence, and causes of
musculoskeletal injury in the work force. The Nordic Musculoskeletal Questionnaire (NMQ) (Kuorinka et al., 1987)
provides one suitable example. After a personal details section, general survey questions provide
indications about prevalence and disability. Different sections covering four separate body areas
follow to establish the severity of any disorder. A general health section rounds off the questionnaire.


Worked  Example  1 

Tasklist  developed  for  an  oil  production  facility 

1. Breaking-turning valves at wellhead
2. Breaking-turning valves in awkward positions
3. Breaking-turning 4" to 6" valves (low pressure)
4. Breaking-turning 2" to 4" valves (high pressure)
5. Lifting-carrying valves and flanges
6. Lifting-carrying pumps and motors
7. Manipulating heavy fittings in awkward positions
8. Lifting 30-50 lbs. to waist height
9. Lifting 50-75 lbs. to waist height
10. Lifting 75-100 lbs. to waist height
11. Lifting 30-50 lbs. to above waist height
12. Lifting 50-75 lbs. to above waist height

Interview


An interview is normally conducted with one person but can be conducted with a group. A
series of questions is asked, and the responses are recorded either on paper or entered directly
into a computer. Alternatively, the whole process is recorded on tape or video for later analysis.
Interviews  can  be  either  structured,  whereby  the  questions  are  specified  in  advance  (e.g.,  orally 
administered  questionnaires),  or  unstructured,  whereby  only  a  general  topic  is  defined  and  the 
interviewer  encourages  the  respondent  to  talk  about  anything  within  that  topic. 

The Critical Incidents Technique (Flanagan, 1954) involves asking job incumbents or
supervisors to provide examples of effective and ineffective performance. The idea is that people
remember critical incidents even though they occur relatively infrequently. The focus is on both positive
and negative events that have a potentially important effect on system objectives. Once several
hundred examples have been collected, they are categorized into 10 to 20 groups, via several iterative
processes, by employees who are knowledgeable about the job concerned. The resulting
categories, in our case of physical dimensions, are deemed to be relevant to effective job performance.

This open-ended technique is best employed early in the job analysis because it sometimes uncovers
key problem areas in a cost-effective manner. It can be used simply, as an open question such as
“describe an incident which occurred to you or another employee you were watching while launching
the rescue boat/constructing a bridge/hauling up a ladder.” Or it can be used in a more systematic
manner by interviewing employees on a regular (weekly, monthly) basis or by asking employees
to respond anonymously by filling in and returning forms. The advantages of this technique include
its ability to pick up rare events that might not be uncovered by other techniques (e.g., observation).
The disadvantages lie in its reliance on accurate memory and reporting by people.

The previously mentioned Repertory Grid is a similar interview method in that it too tries to
elicit specific examples of successful and unsuccessful performance. However, the technique focuses
on the person rather than the task by asking the employee to identify ways in which an effective
employee differs from an ineffective employee. The interviewer explores examples of task performance
and their relationship with the types of physical attributes. For example, which physical
characteristics does the effective employee have that the ineffective employee lacks?


Approaches  to  Quantifying  the  Physical  Demands  of  Jobs 


This section outlines different approaches for quantitatively assessing the physical stress (the
demands) and the strain (the person’s response) associated with work tasks. Obtaining data on factors
such as posture, force, and intensity of work assists in the understanding of work demands by
quantifying them. These data may be used for assembling a battery of candidate fitness tests, for
identifying tasks or jobs in need of redesign, for comparing the demands of jobs before and after
job redesign, and for setting objective targets for rehabilitating injured personnel.

Although simple generic task work rates can be derived from tables or predicted using equations,
more complex occupational tasks must be uniquely measured. There are three basic measurement
approaches to quantifying job demands: physiological, biomechanical, and psychophysical
(Ayoub & Mital, 1989). Again, including more than one approach is preferable to avoid bias in
perceptions and drawing incorrect conclusions about the demands of an occupation.


Physiological  Approach 


The physiological approach appraises the strain on the cardiovascular and respiratory systems
through the measurement of responses such as heart rate (HR), oxygen uptake rate (VO2), or lactate
accumulation. During dynamic activities, such as walking or running, in which the primary energy
supply is via aerobic metabolism, HR and VO2 are linearly related to the work performed, so the
intensity of the work can be estimated by measuring either HR or VO2. During static activities, such
as maintaining a posture or holding an object, the physiological responses may be different from
dynamic activities since the HR response to static exercise is largely independent of the bulk of
muscle involved.

Semidynamic activities, such as lifting and carrying, include both dynamic and static components.
The static component in these tasks may be significant, both in grasping the object to be lifted and
in postural control. These static components may be integral to performing the task but they may not
contribute to accomplishing the work (i.e., raising the mass a given height). Thus, the physiological
responses to the static components are superimposed on the responses to the dynamic components.

Anaerobic metabolism and the role it plays in meeting the energy demands of work tasks is a
complex subject. At a simplistic level, anaerobic metabolism plays a significant role in two
situations: at the onset of dynamic activity, before the cardiovascular and aerobic energy systems
have had time to catch up with the work demand, when an oxygen deficit is incurred; and during
intense dynamic activity, in which the energy demands outstrip the ability of the aerobic system to
meet the requirement (Figure 3.1). Anaerobic metabolism can result in the accumulation of lactate,
which marks the onset of fatigue and ultimately the cessation of activity. Lactate accumulation
occurs when the rate of production exceeds the rate of removal, either because of the sheer volume
being produced or because of reduced blood flow and consequently impaired removal. Lactate
accumulation results in perceptions of fatigue and reduced contractile ability of the muscle, forcing
the employee to stop or slow down until pH rises again and a reasonable level of homeostasis is
restored.

The contribution of anaerobic metabolism to the overall metabolic demand of the activity can
be estimated by measuring VO2 (see Oxygen Uptake, below) during task performance and during
the recovery period, for, say, 15 minutes. From a plot of VO2 versus time, as shown in Figure
3.1, the area under the curve can be calculated by numerical integration and the resting VO2
subtracted. The relative contribution of aerobic and anaerobic metabolism can then be calculated by
comparing the areas under the curve during the work and recovery periods.
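
The area comparison described above can be sketched in a few lines of Python. This is a minimal illustration only: the function names and the sample values are hypothetical, the trapezoidal rule is one of several possible integration methods, and treating the recovery-period excess as a proxy for the anaerobic contribution is a simplifying assumption rather than a validated procedure.

```python
def trapezoid_area(times_min, values):
    """Area under a sampled curve using the trapezoidal rule."""
    area = 0.0
    for i in range(1, len(times_min)):
        dt = times_min[i] - times_min[i - 1]
        area += 0.5 * (values[i] + values[i - 1]) * dt
    return area


def net_oxygen_litres(times_min, vo2_l_min, resting_vo2_l_min):
    """Litres of oxygen consumed above the resting level over the samples."""
    net = [max(v - resting_vo2_l_min, 0.0) for v in vo2_l_min]
    return trapezoid_area(times_min, net)


def recovery_share(work_net_l, recovery_net_l):
    """Fraction of the total net oxygen cost that appears during recovery.

    Assumption: this share is used as a rough proxy for the anaerobic
    contribution; it is not an exact partition of the energy pathways.
    """
    total = work_net_l + recovery_net_l
    return recovery_net_l / total if total > 0 else 0.0


if __name__ == "__main__":
    # Hypothetical samples: 5 min of work, then 15 min of recovery.
    work = net_oxygen_litres([0, 1, 2, 3, 4, 5],
                             [0.3, 1.5, 2.0, 2.0, 2.0, 2.0], 0.3)
    rest = net_oxygen_litres([5, 6, 8, 10, 15, 20],
                             [2.0, 1.0, 0.6, 0.4, 0.3, 0.3], 0.3)
    print(round(work, 2), round(rest, 2), round(recovery_share(work, rest), 3))
```

Sampling more frequently around the start and end of work, where VO2 changes fastest, reduces the error of the trapezoidal estimate.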

This cycle of exercise, fatigue, and recovery depends on the intensity and duration of the activity
and recovery periods and the fitness of the employee, so it is vital that all of these aspects of
performance are appropriately controlled and monitored. All too often published papers on the physical
demands of specific occupations fail to document how participants were selected, how representative
participants were of the general work force, and how the rate of work was established and controlled.

Figure 3.1 Illustration of the oxygen deficit at the onset of activity and the excess post-activity oxygen
consumption (EPOC), indicating the interaction of aerobic and anaerobic energy pathways required to meet
energy demands. Reprinted by permission from J. H. Wilmore & D. L. Costill, 1999, Physiology of Sport and
Exercise, 2nd ed. (Champaign, IL: Human Kinetics), 135.

Oxygen Uptake — Energy expenditure during dynamic work is usually expressed as a rate (i.e., kJ.s⁻¹
or VO2, where 1 liter of oxygen is equivalent to 20.6 kilojoules). Depending on the mode of activity, it
is sometimes more appropriate to adjust the rate of work by body mass (e.g., VO2 in
ml.kg⁻¹.min⁻¹). Direct assessment of the amount of energy produced by the body is not normally
possible, so the indirect assessment of energy cost via the measurement of VO2 is often used as the
next best alternative. VO2 during aerobic exercise is directly proportional to energy expenditure.
The mechanical efficiency (the external work produced divided by the total energy produced) varies
somewhat between persons, and the more complex and skilled the task is, the greater the variation
will be. For common tasks such as walking and running there is less interindividual variation, and
therefore the measurement of small numbers of employees normally suffices. For the more skilled
and complex tasks, greater numbers may be required. Where there is a significant thermal load,
static components to the task, or anaerobic contribution, additional measurements are required to
fully encapsulate the physiological demands (e.g., HR, lactate, or body temperature).
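
As a small worked illustration of the unit conversions mentioned above, the sketch below converts an absolute VO2 into an energy expenditure rate and into a body-mass-adjusted VO2. The function names and the example values are hypothetical; the 20.6 kJ per liter figure is the one quoted in the text.

```python
KJ_PER_LITRE_O2 = 20.6  # energy equivalent of oxygen quoted in the text


def energy_rate_kj_per_s(vo2_l_min):
    """Convert oxygen uptake (l/min) to an energy expenditure rate (kJ/s)."""
    return vo2_l_min * KJ_PER_LITRE_O2 / 60.0


def relative_vo2_ml_kg_min(vo2_l_min, body_mass_kg):
    """Express oxygen uptake relative to body mass (ml/kg/min)."""
    if body_mass_kg <= 0:
        raise ValueError("body mass must be positive")
    return vo2_l_min * 1000.0 / body_mass_kg


if __name__ == "__main__":
    # Hypothetical employee: 75 kg, working at 3.0 l/min.
    print(energy_rate_kj_per_s(3.0))          # about 1.03 kJ/s
    print(relative_vo2_ml_kg_min(3.0, 75.0))  # 40 ml/kg/min
```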

Obtaining VO2 data for specific jobs and tasks enables the most aerobically demanding tasks
to be identified, quantifies that demand, and enables any job modification to be evaluated. A
description of the actual measurement of oxygen uptake is beyond the scope of this section, though
further details may be found in exercise or work physiology textbooks (e.g., Astrand & Rodahl,
1986; Wilmore & Costill, 1994; McArdle, Katch & Katch, 1991). Portable gas analyzers designed
with field use in mind are now widely available. Most are lightweight units attached to the wearer
in a harness. The units typically contain oxygen and carbon dioxide analyzers, a sampling pump,
barometric sensors, and a battery power supply. The carried units either contain a data logger to store
the respiratory data or have a transmitter that conveys the data in near real time to a nearby
personal computer. Although these portable analyzers greatly ease the process of performing indirect
calorimetry on personnel in a work setting, they are expensive to purchase and run, require
expert knowledge to be deployed effectively, and are not always compatible with rugged field use.

Wilson & Corlett (1995) have proposed a work-intensity classification system based on VO2.
Tasks that use a VO2 of more than 2 liters per minute (l.min⁻¹) are classified as “extremely heavy” and
those using 1.5-2.0 l.min⁻¹ as “very heavy.” Tasks using 1.0-1.5 l.min⁻¹ are considered “heavy,” 0.5-1.0
l.min⁻¹ “moderate,” and less than 0.5 l.min⁻¹ “light.” However, although this classification system
categorizes jobs according to their aerobic stress, it fails to take into account the age, sex, and physical
fitness of the employee. These factors collectively will determine the strain on the individual employee.
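
The classification bands above translate directly into a lookup. The sketch below is illustrative only: the function name is hypothetical, and because the quoted bands share their end points, assigning each shared boundary value (0.5, 1.0, 1.5) to the higher band is an assumption made here, not something the classification specifies.

```python
def classify_vo2(vo2_l_min):
    """Work-intensity band for an absolute VO2 (l/min), per the bands above.

    Assumption: shared boundaries (0.5, 1.0, 1.5) fall in the higher band;
    exactly 2.0 stays "very heavy" because the top band is "more than 2".
    """
    if vo2_l_min > 2.0:
        return "extremely heavy"
    if vo2_l_min >= 1.5:
        return "very heavy"
    if vo2_l_min >= 1.0:
        return "heavy"
    if vo2_l_min >= 0.5:
        return "moderate"
    return "light"


if __name__ == "__main__":
    for vo2 in (0.4, 0.8, 1.1, 1.8, 2.9):
        print(vo2, classify_vo2(vo2))
```

Because the bands are absolute, the same output applies to every employee regardless of fitness, which is exactly the limitation noted in the text.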

Wilson & Corlett (1995) also provide examples of oxygen uptakes for a number of occupational
categories. Assembly work, driving, and office work are reported to have typical oxygen
uptakes of 0.3-0.6 l.min⁻¹. Nursing, catering, and light manufacturing have typical oxygen uptakes
of 0.6-1.0 l.min⁻¹. Heavy cleaning and manufacturing are cited as having oxygen uptakes of 0.8-1.5
l.min⁻¹; heavy industrial work, heavy gardening, and agriculture have oxygen uptakes of 1.5-2.0 l.min⁻¹;
and firefighting, manual work in forestry, and mining have oxygen uptakes of 2.0-3.0 l.min⁻¹.

All activities (leisure and work) and the intensity of the activities can also be classified according
to their metabolic equivalent. One metabolic equivalent (MET) equates to resting metabolic
rate, which in turn approximates a VO2 of 3.5 ml.kg⁻¹.min⁻¹. In their textbook, Wilmore & Costill
(1994, p. 523) cite the MET values typically associated with a number of occupational tasks. For
example, sitting at a desk is allocated 1.5, bricklaying and plastering 3.5, and digging 7.5. However,
it should be understood that these figures are mean values and they fail to take into account variations
in the rate of work or individual variations in efficiency.
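
To show how a MET value relates to an absolute oxygen uptake, the following sketch converts in both directions using the 3.5 ml.kg⁻¹.min⁻¹ approximation given above. The function names and the 80 kg body mass are hypothetical, and the result inherits the caveat that tabulated MET values are means.

```python
ML_O2_PER_KG_MIN_PER_MET = 3.5  # approximation of resting VO2 quoted in the text


def vo2_from_mets(mets, body_mass_kg):
    """Absolute oxygen uptake (l/min) implied by a MET value and body mass."""
    return mets * ML_O2_PER_KG_MIN_PER_MET * body_mass_kg / 1000.0


def mets_from_vo2(vo2_l_min, body_mass_kg):
    """MET value implied by an absolute oxygen uptake (l/min) and body mass."""
    if body_mass_kg <= 0:
        raise ValueError("body mass must be positive")
    return vo2_l_min * 1000.0 / body_mass_kg / ML_O2_PER_KG_MIN_PER_MET


if __name__ == "__main__":
    # Digging is cited at 7.5 METs; for a hypothetical 80 kg employee:
    print(vo2_from_mets(7.5, 80.0))  # about 2.1 l/min
```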

The American College of Sports Medicine (ACSM) Position Stand (American College of
Sports Medicine, 1998) classifies physical activity intensity, based on physical activity lasting up to
60 minutes, in METs by age category. The figures are presented in Table 3.1.


Table 3.1 ACSM’s classification of physical activity intensity, based on physical activity lasting up to 60 minutes

                          Endurance-type activity                                        Resistance-type exercise
             Relative intensity             Absolute intensity (METs) in healthy adults    Relative intensity*
Intensity    VO2R (%),    Maximum   RPE†    Young       Middle-aged  Old         Very old     Maximal voluntary
             HRR (%)      HR (%)            (20-39 yr)  (40-54 yr)   (65-79 yr)  (80+ yr)     contraction (%)

Very light   <20          <35       <10     <2.4        <2.0         <1.6        <1.0         <30
Light        20-39        35-54     10-11   2.4-4.7     2.0-3.8      1.6-3.1     1.1-1.9      30-49
Moderate     40-59        55-69     12-13   4.8-7.1     4.0-5.8      3.2-4.7     2.0-2.9      50-69
Hard         60-84        70-89     14-16   7.2-10.1    [illegible]  4.8-6.7     3.0-4.25     70-84
Very hard    ≥85          ≥90       17-19   ≥10.2       [illegible]  ≥6.8        ≥4.25        ≥85
Maximum‡     100          100       20      12.0        10.0         [illegible] 5.0          100

* Based on 8-12 repetitions for persons under age 50-60 years and 10-15 repetitions for persons aged 50-60 years and older.

† Borg Rating of Perceived Exertion 6-20 scale (Borg, 1982).

‡ Maximal values are mean values achieved during maximal exercise by healthy adults. Absolute intensity (METs) values are
approximate mean values for men. Mean values for women are approximately 1-2 METs lower than those for men.
VO2R = oxygen uptake reserve; HRR = heart rate reserve.

Adapted from and reprinted with permission from U.S. Department of Health and Human Services: Physical Activity and Health:
A Report of the Surgeon General. Atlanta: U.S. Department of Health and Human Services, Centers for Disease Control and
Prevention, National Center for Chronic Disease Prevention and Health Promotion, 1996, 33.

We have measured mean VO2 values from 2 to 3 l.min⁻¹ for a number of tasks performed by
British Army personnel and values from 2.7 to 3.1 l.min⁻¹ in Royal Navy personnel. The highest
values in army personnel were recorded in a group of Infantry assaulting an enemy position (2.92
l.min⁻¹), and among Royal Engineers bridge-building, Royal Artillery personnel loading and firing
artillery ammunition, and Royal Armoured Corps personnel changing tank tracks (all around 2.2-2.3
l.min⁻¹) (Rayson, 1998). An investigation of shipboard firefighting performed by naval personnel found
drum carrying to be the most aerobically demanding task, requiring a mean of 3.1 l.min⁻¹ (Bilzon,
Scarpello, Smith, Ravenhill, & Rayson), albeit for short periods of several minutes only.

The oxygen requirement can also be related to the individual employee’s maximal aerobic power
(VO2max). To expose the relevance of different levels of fitness (which are strongly age and gender
related) to the cardiovascular strain on an individual employee, take an example of employee A
whose VO2max is 4 l.min⁻¹ (typical of a young male soldier) and employee B whose VO2max is 2
l.min⁻¹ (typical of an older female soldier). To perform a task requiring a VO2 of 1 l.min⁻¹ requires
only 25% VO2max for soldier A, whereas it requires 50% VO2max for soldier B. It is thus the
percentage of an employee’s VO2max that a task demands, together with its duration, that determines
whether an employee can perform the job.
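
The arithmetic in this example is simple enough to express as a one-line function. The sketch below reproduces the figures for employees A and B; the function name is hypothetical.

```python
def percent_vo2max(task_vo2_l_min, vo2max_l_min):
    """Task oxygen demand expressed as a percentage of an individual's VO2max."""
    if vo2max_l_min <= 0:
        raise ValueError("VO2max must be positive")
    return 100.0 * task_vo2_l_min / vo2max_l_min


if __name__ == "__main__":
    task = 1.0  # l/min, the task demand used in the example above
    print(percent_vo2max(task, 4.0))  # employee A: 25.0
    print(percent_vo2max(task, 2.0))  # employee B: 50.0
```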

Astrand and Rodahl (1986) along with many other authors have estimated that the energy
demands of an 8-hour day should not exceed 30 to 40% of VO2max. Around 50% VO2max may
be sustainable by fit individuals for up to 2 hours, 75% for up to one hour, and 100% for several
minutes. Knowing the duration of the task is as important as quantifying the intensity. Returning
to our example, although an intensity of 25% VO2max should be tolerable for an 8-hour day for
soldier A, an intensity of 50% VO2max would be unsustainable for soldier B.
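
One way to combine intensity and duration is a simple lookup of the sustainable fractions quoted above. The sketch below is a rough illustration under several assumptions that are not in the text: the figures are treated as a step function, "several minutes" is represented by 5 minutes, and the upper end (40%) of the 8-hour range is used. All names are hypothetical.

```python
# (maximum duration in minutes, sustainable % of VO2max), from the figures above.
SUSTAINABLE_LIMITS = [
    (5, 100.0),   # "several minutes"; 5 min is an assumed stand-in
    (60, 75.0),   # up to one hour
    (120, 50.0),  # up to 2 hours
    (480, 40.0),  # 8-hour day; upper end of the 30-40% range
]


def sustainable_percent(duration_min):
    """Highest % VO2max assumed sustainable for the given duration."""
    for max_minutes, percent in SUSTAINABLE_LIMITS:
        if duration_min <= max_minutes:
            return percent
    raise ValueError("durations beyond 8 hours are not covered by these figures")


def is_sustainable(task_vo2_l_min, vo2max_l_min, duration_min):
    """True if the task's relative intensity is within the assumed limit."""
    relative = 100.0 * task_vo2_l_min / vo2max_l_min
    return relative <= sustainable_percent(duration_min)


def minimum_vo2max(task_vo2_l_min, sustainable_pct):
    """VO2max needed so the task demand equals the given sustainable %."""
    return task_vo2_l_min * 100.0 / sustainable_pct


if __name__ == "__main__":
    print(is_sustainable(1.0, 4.0, 480))  # soldier A over an 8-hour day
    print(is_sustainable(1.0, 2.0, 480))  # soldier B over an 8-hour day
```

The sustainable percentage is left as an explicit parameter of `minimum_vo2max` because the appropriate figure depends on the population and on which published estimate is adopted.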

Another important consideration that is often overlooked is the specificity of HR and VO2
data to a particular mode of activity. Too often in the published literature, authors mix data from
different modes of activity (e.g., the VO2 during a manual handling task expressed as a %VO2max
from treadmill running). Data from Petrofsky & Lind (1978) comparing VO2max of cycling and
materials handling and more recent data of our own comparing the two ostensibly similar activities
of running and marching (Rayson, Bell, Davies & Rhodes-James, 1995) show that the relationships
between HR, VO2, and activity duration are activity specific and should not be used
interchangeably. For example, it would be dangerous to assume that because a person’s VO2max on
the treadmill is 4 l.min⁻¹ he could sustain a VO2 of 1.2-1.6 l.min⁻¹ (30-40% VO2max) all day
performing material handling tasks. The VO2max for material handling tasks (or more strictly the
VO2peak) and hence the maximum sustainable level of performance for this type of work would
most probably be considerably lower.

Heart Rate — Unlike VO2, which is a measure of cardiovascular stress, heart rate (HR) is a measure
of cardiovascular strain. It is a composite index reflecting the psychological and thermal loads
as well as the physical demands of the activity, and therefore any interpretation of HR data must
consider these other influencing factors. As with VO2, during moderate-intensity dynamic exercise,
HR increases over the first few minutes until a steady state or plateau is reached. Steady-state
HR is linearly related to workload and VO2, and therefore HR may, under certain circumstances,
be used as a surrogate for VO2. However, this should only be done if the relationship between HR
and VO2 is known for the employees under investigation during a similar mode of activity. The
gradients of the HR versus VO2 and HR versus workload lines depend on fitness as well as the
mode of activity: the fitter the employee, the shallower the gradient is.

HR is relatively simple to measure using, for example, over-the-counter heart rate monitors
comprising a chest strap that detects and transmits the heart-rate signal and a wrist monitor that
receives and stores the data. HR monitors can be programmed to store data at predefined intervals
(e.g., every 5, 30, or 60 seconds) and may be worn by employees throughout the day with little
interference. Given that HR is a nonspecific measure of occupational task demands, it is all the
more important to monitor and record contextual information such as temperature, humidity,
and clothing to supplement HR data. If activity data are collected simultaneously, peak
and mean HR can be calculated for different tasks performed during the day.

These data can also be used to estimate the intensity of the tasks or job in terms of the
proportion of maximum heart rate (HRmax, which may be estimated as 220 - age) or of heart rate
reserve (HRR = HRmax - HRrest), both of which can be measured or estimated, and in terms of
cardiovascular strain. Wilson & Corlett (1995) suggest that HR during prolonged work of up to 90
beats per minute is indicative of light strain, 90-110 of moderate, 110-130 of heavy,
130-150 of very heavy, and 150-170 of extremely heavy strain, but these classifications do not
take age into account. The ACSM Position Stand (ACSM, 1998) classifies physical activity intensity,
based on physical activity lasting up to 60 minutes, as %HRR, as presented in Table 3.1. %HRR is
calculated as (HRwork - HRrest) / (HRmax - HRrest) x 100, where HRwork is the HR recorded during
the activity.
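
A short sketch of these HR-based intensity measures follows. It assumes the usual definition of %HRR, that is (working HR - resting HR) / (HRmax - resting HR) x 100, and the age-based HRmax estimate mentioned above. The function names are hypothetical, and assigning each shared band boundary to the higher strain category is an assumption.

```python
def estimated_hr_max(age_years):
    """Age-predicted maximum heart rate (220 - age), in beats per minute."""
    return 220 - age_years


def percent_hr_reserve(hr_work, hr_rest, hr_max):
    """Working HR as a percentage of heart rate reserve (HRmax - HRrest)."""
    reserve = hr_max - hr_rest
    if reserve <= 0:
        raise ValueError("HRmax must exceed resting HR")
    return 100.0 * (hr_work - hr_rest) / reserve


def classify_hr_strain(hr_bpm):
    """Strain category for HR during prolonged work, per the bands above.

    Assumption: a value on a shared boundary (90, 110, 130, 150) is placed
    in the higher category; values above 170 are outside the listed bands.
    """
    if hr_bpm < 90:
        return "light"
    if hr_bpm < 110:
        return "moderate"
    if hr_bpm < 130:
        return "heavy"
    if hr_bpm < 150:
        return "very heavy"
    if hr_bpm <= 170:
        return "extremely heavy"
    return "above listed range"


if __name__ == "__main__":
    # Hypothetical 30-year-old with a resting HR of 60 working at 125 beats/min.
    hr_max = estimated_hr_max(30)
    print(percent_hr_reserve(125, 60, hr_max), classify_hr_strain(125))
```

Where a measured HRmax is available it should replace the age-based estimate, since the estimate can differ substantially between individuals.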

If measuring working HR is not possible, it might still be possible to measure recovery HR.
Recovery HR depends on the HR during work and the fitness of the employee. The quicker the
recovery, the fitter the employee is. According to Brouha’s method (Brouha, 1960), the
employee is seated immediately after activity and the HR is measured for 3 minutes. The number
of heartbeats during the first, second, and third minutes (HR1, HR2, and HR3) is counted:

1. If HR1 - HR3 > 10, or if HR1, HR2, and HR3 are all below 90, then recovery is normal.

2. If the average of HR1 over a number of recordings is < 110, and HR1 - HR3 > 10, the
workload is not excessive.

3. If HR1 - HR3 < 10, and if HR3 > 90, then recovery is inadequate.
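
The three criteria can be written as a small decision function. This is a sketch under stated assumptions: the function names are hypothetical, the inequalities are implemented exactly as listed (so some combinations, such as a drop of exactly 10 beats, fall into neither the normal nor the inadequate category), and the second criterion is applied to the means of repeated HR1 and HR3 recordings.

```python
def brouha_recovery(hr1, hr2, hr3):
    """Classify recovery from beats counted in minutes 1, 2, and 3 after work."""
    drop = hr1 - hr3
    if drop > 10 or (hr1 < 90 and hr2 < 90 and hr3 < 90):
        return "normal"
    if drop < 10 and hr3 > 90:
        return "inadequate"
    return "unclassified"  # not covered by the criteria as listed


def workload_not_excessive(hr1_recordings, hr3_recordings):
    """Criterion 2, applied to the means of several recordings (assumption)."""
    mean_hr1 = sum(hr1_recordings) / len(hr1_recordings)
    mean_hr3 = sum(hr3_recordings) / len(hr3_recordings)
    return mean_hr1 < 110 and (mean_hr1 - mean_hr3) > 10


if __name__ == "__main__":
    print(brouha_recovery(120, 105, 100))
    print(brouha_recovery(115, 112, 110))
    print(workload_not_excessive([105, 100], [90, 88]))
```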

In Worked Example 2 that follows, we provide an illustration of how HR data collected in the
field can be combined with laboratory measurements of oxygen uptake to estimate the cardiovascular
demands of a task. Minimum acceptable standards of aerobic fitness can be established by this
method. The data are from a recent military project that we conducted to define the physical
demands of rural patrolling by soldiers.

Body Temperature — Body temperature is normally maintained around 37°C, but during strenuous
exercise or in extreme conditions of heat or cold, temperatures can fluctuate away from this value.
Body temperature reflects a fine balance between heat gain and heat loss. During exercise, most
of the energy the body produces is converted to heat. If the body’s heat production exceeds its loss,
core temperature rises. Heat loss can take place via the processes of radiation, conduction, convection,
and evaporation. But if the ambient temperature is greater than the body’s temperature, heat
may be gained, not lost, by radiation, conduction, and convection. Similarly, exposure to the sun’s

Worked Example 2


We measured HR over a 5-day exercise using Polar HR monitors, downloading the data
at the end of each working day. By Wilson & Corlett’s criteria (1995), HR during most
activities in the majority of soldiers equated to a “moderate” workload, indicating that the
workload was sustainable without undue physical fatigue. However, HR in a minority of
soldiers equated to a “heavy” workload that would result in fatigue and impaired performance
over time. On average, 24% (122 min), 8% (41 min), and 2% (6 min) of time on patrol
were spent above 60%, 70%, and 80% HRmax, respectively. Eight percent of patrol time (41
minutes) was spent at intensities of work above the soldiers’ anaerobic threshold, indicating
the importance of anaerobic, as well as aerobic, fitness to patrolling. Two of the five days
were significantly more demanding than were the other three. Soldiers within the platoon
fulfilling a particular function and carrying particular items of equipment had a lower HR
than the remaining soldiers performing all other roles. Surprisingly, these individuals were
found to be carrying above-average loads. On further investigation, high levels of aerobic
fitness were found to account for this apparent anomaly. Without the additional data on
weight of loads and fitness, an erroneous conclusion could easily have been made.
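
Time-above-threshold summaries of the kind reported in this example can be derived from logged HR samples as sketched below. The function name, the sample values, and the 60-second logging interval are hypothetical; the sketch assumes evenly spaced samples.

```python
def time_above_threshold(hr_samples, hr_max, percent_threshold, interval_s):
    """Minutes and share of time (%) spent above a given % of HRmax.

    Assumes hr_samples were logged at a fixed interval of interval_s seconds.
    """
    if not hr_samples:
        return 0.0, 0.0
    limit = hr_max * percent_threshold / 100.0
    n_above = sum(1 for hr in hr_samples if hr > limit)
    minutes = n_above * interval_s / 60.0
    share = 100.0 * n_above / len(hr_samples)
    return minutes, share


if __name__ == "__main__":
    samples = [100, 120, 140, 160, 180, 110]  # hypothetical one-minute samples
    for threshold in (60, 70, 80):
        print(threshold, time_above_threshold(samples, 200, threshold, 60))
```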

We also used the same HR data to estimate the energy cost of the patrols using the HR
versus VO2 relationship that we had calculated from a simulated loaded march on a treadmill
(i.e., a near-identical mode of activity to patrolling) in all participants. Steady-state HR
and VO2 values from the last 30 seconds of each 3-minute workload of the treadmill test
were regressed to form a linear equation of the form VO2 = m x HR + c. Resting HR
values were taken as the minimum HR recorded during sleep. This value was plotted
against a theoretical resting oxygen uptake value of 3.5 ml.kg⁻¹.min⁻¹ (1 MET). R² values
for the equations ranged from 0.88 to 0.99, indicating that the equations fitted the
data well (i.e., 88% to 99% of the variation in VO2 could be accounted for by HR).
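
The individual HR versus VO2 calibration described here is an ordinary least-squares line. The sketch below fits VO2 = m x HR + c and reports R² without external libraries. The function names and the data points are hypothetical and are not the study data; the first point pairs a resting HR with the theoretical resting value of 3.5 ml.kg⁻¹.min⁻¹.

```python
def fit_hr_vo2(hr_values, vo2_values):
    """Least-squares fit of VO2 = m * HR + c; returns (m, c, r_squared)."""
    n = len(hr_values)
    mean_hr = sum(hr_values) / n
    mean_vo2 = sum(vo2_values) / n
    sxx = sum((x - mean_hr) ** 2 for x in hr_values)
    sxy = sum((x - mean_hr) * (y - mean_vo2)
              for x, y in zip(hr_values, vo2_values))
    m = sxy / sxx
    c = mean_vo2 - m * mean_hr
    ss_res = sum((y - (m * x + c)) ** 2 for x, y in zip(hr_values, vo2_values))
    ss_tot = sum((y - mean_vo2) ** 2 for y in vo2_values)
    return m, c, 1.0 - ss_res / ss_tot


def estimate_vo2(hr, m, c):
    """Estimate VO2 for a field HR value from the fitted line."""
    return m * hr + c


if __name__ == "__main__":
    # Hypothetical calibration: resting point plus four treadmill workloads.
    hr = [50, 100, 120, 140, 160]        # beats per minute
    vo2 = [3.5, 17.0, 23.0, 28.5, 34.5]  # ml/kg/min
    m, c, r2 = fit_hr_vo2(hr, vo2)
    print(round(m, 3), round(c, 2), round(r2, 3))
    print(round(estimate_vo2(130, m, c), 1))
```

Estimates should only be made within the HR range covered by the calibration points and for a mode of activity similar to the one used in the treadmill test.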

Using the daily HR data and the individually determined HR versus VO2 relationship,
we estimated VO2 for each soldier throughout the day’s patrolling activities. The overall
mean oxygen uptake of the 5 days of rural patrols was 1.10 l.min⁻¹ (SD 0.46) or
15.0 ml.kg⁻¹.min⁻¹ (SD 6.4), which equates to a “heavy” workload by Wilson & Corlett’s
classification (1995) and a “light” intensity by ACSM’s classification for the young (20-39
years) (ACSM, 1998). The mean time in minutes and the proportion of the total time spent
at workloads above 40%, 50%, 60%, 70%, and 80% of individual VO2max were also calculated.
The soldiers worked at an overall mean %VO2max of 33% (SD 6), with mean values
per day ranging from 26% (SD 6) to 37% (SD 7). On the most physically demanding day,
50% of soldiers performed at a work intensity above 40% VO2max (the suggested maximum
sustainable intensity of work). Occasional fairly brief spurts of activity that were very
demanding were recorded. Peak oxygen uptakes during the most demanding 10 minutes of
the exercise averaged just over 2 l.min⁻¹. Peak oxygen uptakes during the most demanding
60 minutes of the patrol averaged approximately 1.60 l.min⁻¹. On the basis of these last figures
(60 min at 1.6 l.min⁻¹) we proposed minimum standards of aerobic fitness of 3.2 l.min⁻¹,
based on the premise that approximately 50% VO2peak could be sustained for 1 hour.

radiation may result in heat gain, not loss. During exercise the primary route of heat loss is evaporation
of sweat, though this too may be impaired by clothing, high humidity, and/or dehydration.

Prolonged heavy sweating, the body’s normal primary heat loss mechanism during exercise in
ambient conditions, can also result in fluid loss if the fluid is not replaced at an equivalent rate. Fluid
loss leads to reduced blood volume, and reduced blood volume leads to reduced cardiac output,
curtailing our ability to continue working as efficiently and to dissipate heat as effectively. Heart rate
and body temperature become elevated during exercise when more than 2% of body mass (fluid) is lost.
Exercise and work performance decline by approximately 10% at 2% body mass lost and 25% at 4%
body mass lost (Saltin & Costill, 1988, in Wilmore & Costill, 1994).

Strenuous physical work in hot environments imposes large stresses and strains on the employee. It
is important as a minimum to monitor ambient conditions during a job analysis; and where a significant
thermal load is suspected, measurements of deep body (core) and skin temperature should be made.

Ambient conditions are normally monitored by measuring the wet bulb globe temperature
(WBGT), a weighted composite index of wet bulb temperature, radiant heat, and dry bulb temperature.
The index can be used to describe conditions in which work is performed. It can also be
used to predict the risk of heat injury for exercise at different intensities. Further details about thermal
considerations may be obtained from any thermal physiology textbook (e.g., Parsons, 1993)
and the International Organization for Standardization (ISO) series of publications (e.g., ISO, 1993).
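
The text does not give the WBGT weightings. As an illustration only, the sketch below uses the weightings commonly associated with the ISO 7243 formulation (0.7, 0.2, and 0.1 with solar load; 0.7 and 0.3 without); these constants are an assumption relative to this report and should be checked against the standard before use. The function names and example temperatures are hypothetical.

```python
def wbgt_with_solar_load(t_natural_wet_bulb, t_globe, t_dry_bulb):
    """WBGT (deg C) outdoors with solar load, using commonly cited weights."""
    return 0.7 * t_natural_wet_bulb + 0.2 * t_globe + 0.1 * t_dry_bulb


def wbgt_without_solar_load(t_natural_wet_bulb, t_globe):
    """WBGT (deg C) indoors or outdoors without solar load."""
    return 0.7 * t_natural_wet_bulb + 0.3 * t_globe


if __name__ == "__main__":
    # Hypothetical readings in deg C.
    print(wbgt_with_solar_load(25.0, 40.0, 30.0))
    print(wbgt_without_solar_load(25.0, 40.0))
```

The heavy weighting of the wet bulb term reflects the importance of evaporation as the main route of heat loss during exercise.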

Body temperature can be assessed in numerous ways, including oral, aural, esophageal, and rectal
for deep body temperature, or on the skin for peripheral temperature. Due to the variation in the
temperature of different tissues, mean temperature is sometimes used. Mean temperature is calculated as
a weighted average of skin and internal temperatures. For example, skin thermistors may be placed on
the arm (Ta), trunk (Tt), leg (Tl), and head (Th), and skin temperature would be calculated as

Tskin = (0.1 x Ta) + (0.6 x Tt) + (0.2 x Tl) + (0.1 x Th)

The constants in the equation represent the proportion of the total skin area represented by each
region. Once core temperature (Tc) has been measured, mean body temperature can be calculated as


Tbody = (0.4 x Tskin) + (0.6 x Tc)
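
The two weighted averages can be computed as follows. The weights are those given in the equations above (0.1 arm, 0.6 trunk, 0.2 leg, 0.1 head; 0.4 skin, 0.6 core); the function names and the example temperatures are hypothetical.

```python
# Proportion of total skin area assigned to each measurement site.
SKIN_WEIGHTS = {"arm": 0.1, "trunk": 0.6, "leg": 0.2, "head": 0.1}


def mean_skin_temperature(site_temps_c):
    """Area-weighted mean skin temperature from the four site readings (deg C)."""
    return sum(weight * site_temps_c[site]
               for site, weight in SKIN_WEIGHTS.items())


def mean_body_temperature(t_skin_c, t_core_c):
    """Mean body temperature from mean skin and core temperatures (deg C)."""
    return 0.4 * t_skin_c + 0.6 * t_core_c


if __name__ == "__main__":
    sites = {"arm": 32.0, "trunk": 34.0, "leg": 31.0, "head": 35.0}
    t_skin = mean_skin_temperature(sites)
    print(round(t_skin, 2), round(mean_body_temperature(t_skin, 37.5), 2))
```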


Worked Example 3 describes a relatively noninvasive novel technique for measuring deep body
or core temperature that we have used recently as part of a job analysis. To our knowledge, the
device is not commercially available currently, but it offers exciting potential. The purpose of
including this physiological technique in our job analysis was to ascertain the presence and extent
of any hypothermic and hyperthermic strain in soldiers carrying out rural patrols.

Stress hormones — Measuring the concentration of stress hormones in the body provides another
avenue for assessing physiological (and psychological) strain, as high levels of strain may be reflected
in changes in endocrine function. Blood levels of cortisol may increase, while thyroxine and
testosterone levels may be depressed (Wilmore & Costill, 1994). Resting blood levels of epinephrine
and norepinephrine may also be elevated, resulting in raised heart rate and blood pressure.

However, collecting blood to measure these hormones is an invasive procedure, and measurement is
complex and expensive. Further, unless baseline data are obtained under normal conditions in the

Worked  Example  3 

We used a Thermal Monitoring System originally designed for measuring core body
temperature in deep-sea divers (Mekjavic, Tomsic, Gider, Golder, & Tipton, 1996; Tipton,
Franks & Golder, 1997). The system comprised a radio pill, data logger, and two temperature
sensors. The nonrecoverable pill (in itself, a major advance relative to previous recoverable
pills) contains a blocking oscillator near-field transmitter powered by a battery, all of
which is encapsulated in medical grade epoxy used in surgical implants.

The radio pill is ingested. An AM receiver monitors the radio pill emissions, and the
pulses sampled are stored in the random access memory of the logger. The pulse frequency
is converted to temperature values (°C) based on a calibration equation previously derived
for each pill. The logger is programmed and the collected data retrieved by connecting it to
a personal computer using Mini-Mitter software.

The length of time the pill remains in the intestine varies with intestinal motility,
which is influenced by diet, but typically the pill stays in the body for
1 to 3 days. Few studies have used a radio pill to measure core body temperature, so
less is known about the normal expected ranges. However, it is unlikely that the temperature
of the gastrointestinal tract will differ from other sites used to measure core body
temperature (rectal, esophageal, tympanic) by more than 0.5°C.

Core body temperatures between 35°C and 39°C are unlikely to cause any serious health
problems if the soldier is able to regulate at these temperatures and is not at the extremes for
long periods. “Normal” body temperatures throughout the day in cool ambient conditions
(experienced during this study) and low exercise intensities are between 36°C and 38°C.

In women, body temperature varies with the menstrual cycle. Resting and exercising
core body temperatures are higher during the midluteal phase than during the late follicular
phase (Kolka & Stephenson, 1997). In addition, women appear to have a lower heat tolerance,
a higher sweating threshold, and a lower sweating capacity than men (Fox, Lofstedt,
Woodward, Eriksson, & Werkstrom, 1969). Fitness also affects temperature regulation
and may mask any gender differences. The core body temperature at which an individual
reaches steady state during exercise depends on the relative rather than the absolute exercise
intensity. Therefore, a fitter individual will have a lower core body temperature for a
given workload (Saltin & Hermansen, 1966).

An example core body temperature trace from our study is provided in Figure 3.2. The
relatively high starting temperature of 38.2°C at approximately 08.20 indicates that the soldier
had exercised, had been in a hot environment, or possibly had consumed a hot drink
before the measurements started (the pill would have been in the stomach at this stage and
may have been in physical contact with the hot fluid). The first increase in temperature
occurred between 09.30 and 11.00 and corresponded with an HR averaging ITTb.min'1.
The plateau in the temperature from 10.00 to 11.00 indicates that the soldier was able to
thermoregulate with a steady state temperature of 38.7°C. Temperature decreased when the
exercise ceased at 11.00. The next increase in temperature corresponded with a mean HR
of 155 b.min-1, between 12.00 and 13.20. The temperature showed no signs of reaching a
plateau, indicating that the soldier was unable to thermoregulate at this exercise intensity.
However, when the soldier stopped at 13.20, temperature declined rapidly.

Collection of core body temperature data using ingested telemetry pills proved socially
acceptable among the soldiers and viable in the field, and it elicited interesting data. This
technology provides a useful measurement technique where rectal temperature measurement
is not possible. Three of our nine soldiers (33%) exhibited core body temperatures in
excess of 39°C, the generally accepted upper limit for safe operations. No evidence was
found of these soldiers' ability to thermoregulate at this temperature (no plateau in
the trace could be detected), providing some cause for concern. No soldiers exhibited
temperatures below 35°C, the generally accepted lower limit for safe operations.

Figure 3.2 Trace of deep body temperature (plotted against time, hh:mm) throughout a day’s
patrolling activity in a soldier, measured via an ingested telemetry pill. Reprinted by permission
from ACSM, 1998 Classification of Physical Activity Intensity Position Stand. 30.6, 978.
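The 35°C and 39°C limits quoted in Worked Example 3 lend themselves to a simple screening step once traces have been downloaded from the logger. The following is a minimal sketch of that idea, not the analysis used in the study; the function names and the three example traces are hypothetical.

```python
from typing import Dict, Sequence

LOWER_LIMIT_C = 35.0  # lower limit for safe operations quoted in the text
UPPER_LIMIT_C = 39.0  # upper limit for safe operations quoted in the text


def limit_flags(trace_c: Sequence[float]) -> Dict[str, bool]:
    """Report whether a core temperature trace leaves the 35-39 deg C band."""
    if not trace_c:
        raise ValueError("trace must contain at least one reading")
    return {
        "above_upper": max(trace_c) > UPPER_LIMIT_C,
        "below_lower": min(trace_c) < LOWER_LIMIT_C,
    }


def fraction_above_upper(traces: Sequence[Sequence[float]]) -> float:
    """Fraction of individuals whose trace exceeds the upper limit."""
    if not traces:
        raise ValueError("at least one trace is required")
    flagged = sum(1 for trace in traces if limit_flags(trace)["above_upper"])
    return flagged / len(traces)


# Hypothetical traces (deg C), one list per soldier, for illustration only.
traces = [
    [37.1, 38.2, 38.7],
    [36.9, 38.4, 39.3],
    [37.0, 37.8, 38.1],
]
print(f"Fraction above {UPPER_LIMIT_C} deg C: {fraction_above_upper(traces):.2f}")
```

A screen of this kind only flags excursions; judging whether a soldier was thermoregulating still requires inspecting the trace for a plateau, as described in the worked example.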


specific sample of participants involved in the job analysis, the findings are difficult to interpret, as
variability both within and between people can be considerable. These procedures are therefore not
normally applicable to conducting a Physical Demands Analysis, unless the demands are extremely
high and a state of overtraining or exhaustion in the work force is suspected.

We have found salivary cortisol to be a potentially useful marker of stress, as it can be obtained
by noninvasive methods. During acute exercise or stress, only a minor increase in cortisol is found,
but during chronic exercise or stress, levels can increase by 150 percent in 30 minutes. Cortisol
levels show a circadian pattern, generally peaking in the morning around breakfast time, reaching the
lowest values during late morning, and then rising gradually throughout the afternoon, evening,
and night (Nieman, 1996), so obtaining baseline data in all participants on a “normal” nonworking
day is important. However, investigating salivary cortisol still requires access to biochemistry
laboratories for analysis. In addition, interpreting the data remains challenging, as threshold values
indicating elevation and a high degree of strain appear to be quite variable between individuals, and
accepted normative values have yet to be established.

Global Positioning System and energy prediction equations — Recently, we have been experimenting,
with reasonable success, with portable Global Positioning System (GPS) receivers to collect
positional and time data for soldiers in the field. During our recent study on patrolling, each soldier
carried a GPS receiver in the top of his rucksack while deployed. The post-processed coordinates were
fed into an AXIS digital mapping system, which forms part of a concept demonstrator program.
Vertical distances were calculated by overlaying the GPS data onto contoured maps. For the
Physical Demands Analysis, we calculated the mean distance covered per patrol to be 4.134 km and
the mean speed to be 1.4 km/h, with 63 meters of vertical ascent and 23 meters of vertical descent (i.e.,
the noncircular patrol route ended at an altitude 40 meters higher than at the start).
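To illustrate how distance and mean speed can be derived from a series of position fixes, the sketch below sums great-circle distances between consecutive fixes using the haversine formula. This is not the AXIS mapping procedure used in the study; the function names, the spherical Earth radius, and any coordinates supplied are assumptions made for illustration.

```python
import math
from typing import Sequence, Tuple

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius; an assumption of this sketch


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in metres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def path_length_m(fixes: Sequence[Tuple[float, float]]) -> float:
    """Horizontal path length: sum of distances between consecutive fixes."""
    return sum(
        haversine_m(lat1, lon1, lat2, lon2)
        for (lat1, lon1), (lat2, lon2) in zip(fixes, fixes[1:])
    )


def mean_speed_kmh(distance_m: float, duration_s: float) -> float:
    """Mean speed in km/h from a distance in metres and a duration in seconds."""
    if duration_s <= 0:
        raise ValueError("duration must be positive")
    return (distance_m / 1000.0) / (duration_s / 3600.0)
```

Vertical ascent and descent are not recoverable from this horizontal calculation; in the study they were obtained by overlaying the GPS data onto contoured maps.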

From this information we estimated the energy cost of the patrols using a formula developed by
Pandolf, Givoni & Goldman (1977). This approach provides an alternative method for estimating the
aerobic demand of an activity in the field without measuring any physiological data (e.g., HR and VO2).

Metabolic cost of walking, M (watts):

$$M = 1.5W + 2.0\,(W + L)\left(\frac{L}{W}\right)^{2} + T\,(W + L)\left(1.5V^{2} + 0.35\,VG\right)$$

Where:

W = body mass (kg)
L = load mass (kg)
T = terrain factor
V = velocity (m/s)
G = gradient (%)
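The equation above can be sketched in code as follows, using the published form of the Pandolf equation (standing term, load term, and movement term). The function name and the example inputs are hypothetical; negative gradients are rejected because the equation is not valid for downhill walking.

```python
def pandolf_metabolic_rate_w(
    body_mass_kg: float,
    load_mass_kg: float,
    terrain_factor: float,
    velocity_m_s: float,
    gradient_pct: float,
) -> float:
    """Metabolic cost of loaded walking in watts (Pandolf et al., 1977)."""
    if body_mass_kg <= 0:
        raise ValueError("body mass must be positive")
    if gradient_pct < 0:
        raise ValueError("the equation is not valid for downhill walking")
    w, l, t, v, g = (body_mass_kg, load_mass_kg, terrain_factor,
                     velocity_m_s, gradient_pct)
    standing = 1.5 * w                           # cost of standing unloaded
    load = 2.0 * (w + l) * (l / w) ** 2          # cost of supporting the load
    movement = t * (w + l) * (1.5 * v ** 2 + 0.35 * v * g)  # cost of walking
    return standing + load + movement


# Hypothetical inputs: 70 kg soldier, 20 kg load, terrain factor 1.0,
# 1.34 m/s on level ground.
print(f"{pandolf_metabolic_rate_w(70.0, 20.0, 1.0, 1.34, 0.0):.1f} W")
```

Keeping the three terms separate makes it easy to see how much of the predicted cost is attributable to the load as opposed to speed and gradient.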

The Pandolf equation has been compared with observed data by a number of authors. It has
been found to predict slightly high for standing with loads and low for walking at slow speeds,
both on a grade and on the level (Pimental & Pandolf, 1979; Pimental, Shapiro & Pandolf, 1982).
Duggan & Ramsay (1987) reported that predictions averaged 3% too high but were generally in good
agreement with measured values while walking at 6 km/h on the level with and without a 21 kg
load. However, the equation is not valid for downhill walking, which limits its application. Epstein,
Stroschein & Pandolf (1987) developed an equation for predicting the metabolic cost of running
with and without backpack loads.


Biomechanical  Approach 

The biomechanical approach scrutinizes the forces exerted on and by the body during work. It
is often used to analyze postures and the support and movement of loads and is therefore
particularly useful in assessing most material handling tasks. The body is viewed as a system of levers and
joints in which the levers are rotated around the joint by the action of skeletal muscles. The
muscles attach close to the joint, allowing a small contractile distance of the muscle to be transformed
into large movements of the distal end of the lever. The mechanical advantage of the loads at the
distal end of the lever over the muscles results in the generation of large muscle forces to overcome
relatively small loads (Ayoub & Mital, 1989).
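The lever argument can be made concrete with a static torque balance about the joint: muscle force multiplied by its moment arm equals load force multiplied by its moment arm. The sketch below assumes both forces act perpendicular to the lever and ignores the weight of the limb segment; the function name and the example distances are hypothetical.

```python
def required_muscle_force_n(
    load_force_n: float, load_arm_m: float, muscle_arm_m: float
) -> float:
    """Muscle force needed to hold a load in static equilibrium about a joint."""
    if muscle_arm_m <= 0:
        raise ValueError("muscle moment arm must be positive")
    if load_arm_m < 0:
        raise ValueError("load moment arm cannot be negative")
    # Torque balance: F_muscle * d_muscle = F_load * d_load
    return load_force_n * load_arm_m / muscle_arm_m


# Hypothetical case: 100 N load held 0.35 m from the joint,
# muscle attaching 0.05 m from the joint.
force = required_muscle_force_n(100.0, 0.35, 0.05)
print(f"Required muscle force: {force:.0f} N")
```

With these illustrative distances the muscle must generate seven times the load force, which is the sense in which small loads demand large muscle forces.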

Biomechanics is a useful tool in our repertoire for conducting a Physical Demands Analysis. It
is particularly useful for comparative studies in which different conditions or methods for
performing a task are compared, rather than for precise quantification of workload. Some of the more
useful biomechanical methods available to us include posture recordings, measurements of force
and maximum voluntary contractions (MVC), measurements of muscular activity via
electromyography (EMG), and estimates from biomechanical analyses.

Posture — The recording and analysis of posture during task performance may be useful for several
reasons, including descriptive purposes and the identification of any health and safety issues. Depending
on the accuracy required, images from photographs or videos may suffice for recording postures and
calculating approximate body segment angles. Two synchronised images taken at 90° to each other provide
additional information, but inaccuracies caused by parallax remain. A biomechanical analysis
ideally requires more accurate data, but for assessing posture in relation to discomfort, strain,
stability, or force exertions, simple observational methods are adequate (Wilson & Corlett, 1995).

The OWAS method is a posture coding system designed in Finland for industrial use (Finnish
Institute of Occupational Health, 1992). Postures are observed and recorded using a recording
sheet, and each posture is assessed for acceptability using an assessment sheet. A six-figure OWAS
code is generated: the first three figures record the posture, the fourth figure indicates the force, and
the fifth and sixth figures indicate the task being performed. The recommended method is to
glance at the work at predefined sampling intervals and then to look away and record the data.
From these samples, the proportion of time spent in different postures and the forces
exerted can be estimated. Although the precision of the data is low, the method enables rapid
identification of the major inadequate postures during force exertion. McAtamney & Corlett (1993)
have developed an analogous procedure, called Rapid Upper Limb Assessment (RULA), to assess
the exposure of employees to the risk of upper limb disorder.
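The sampling logic described above can be sketched as follows: each six-figure code is split into its posture, force, and task fields, and the share of samples in each posture is taken as an estimate of the proportion of time spent in it. The names and the example codes are hypothetical, and the sketch does not attempt the OWAS acceptability assessment itself.

```python
from collections import Counter
from typing import Dict, Iterable, NamedTuple


class OwasSample(NamedTuple):
    posture: str  # first three figures
    force: str    # fourth figure
    task: str     # fifth and sixth figures


def parse_owas_code(code: str) -> OwasSample:
    """Split a six-figure OWAS code into posture, force, and task fields."""
    if len(code) != 6 or not code.isdigit():
        raise ValueError(f"expected a six-figure code, got {code!r}")
    return OwasSample(posture=code[:3], force=code[3], task=code[4:])


def posture_time_proportions(codes: Iterable[str]) -> Dict[str, float]:
    """Estimate the proportion of time in each posture from sampled codes."""
    samples = [parse_owas_code(code) for code in codes]
    if not samples:
        raise ValueError("at least one sample is required")
    counts = Counter(sample.posture for sample in samples)
    return {posture: n / len(samples) for posture, n in counts.items()}


# Hypothetical codes recorded at fixed sampling intervals.
observed = ["121101", "121101", "213201", "121102"]
print(posture_time_proportions(observed))
```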

To improve accuracy over direct observational methods of posture, goniometers can be used to
measure angles between body segments or between body segments and the vertical. Where spinal
measures are made, Corlett suggests using the goniometer at L3 to L5 to estimate the angle at the
lumbar-sacrum junction, and on the lower part of the thoracic spine to estimate spinal angle.

Technological developments have led to the production and widespread availability of
computerised electronic devices for measuring spinal motion (Marras, Ferguson & Simon, 1990). The
devices are secured to the body at the chest and hips. Velocity and acceleration as well as motion
are calculated by the software.

There are now a number of commercial systems on the market that use multiple video cameras
and sophisticated computer software to track markers placed at strategic points on subjects
performing their work activities. Although these advances ease the process of capturing work activity,
it remains a long and tedious process to extract and analyse the data. Analysis times of
up to 10 times the recorded time are not unusual. Mainly for this reason, simpler, more direct
measurements are often favored in a work setting.




Chapter  3:  Job  Analysis 


Force measurement — Given the logarithmic relationship between the amount of time a force can
be sustained and the proportion of MVC that this force represents, it is often useful in tasks that
involve a significant force component to measure both the forces involved and the MVC in the
relevant muscle groups and postures in employees.

Measuring the loads and forces involved in work tasks can often be done quite simply by weighing
any objects to be lifted and carried and by inserting a force transducer between the employee
and the object in the case of pushing and pulling tasks. In the case of tasks that involve movement,
initial forces will often be greater than sustained forces, since they will reflect the extra force
required to overcome inertia to start the object in motion. The greater the acceleration is, the
greater the initial force will be. The speed and acceleration of movements by employees therefore
have an impact on the forces exerted, and efforts should be made to control these parameters.
Employees often perform tasks with additional vigor when performing for the investigator's
benefit. Investigators should be aware of this tendency and take steps to counter it.

If we calculate the percentage that the measured force represents as a proportion of an
employee’s MVC for that muscle group and posture, and we know its duration and frequency, we can judge
its acceptability by reference to accepted threshold values. We could also estimate the maximum
duration for which a given force could be sustained in all employees for whom we have an MVC,
but that information is less useful.
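The calculation described above amounts to expressing the measured task force as a percentage of the employee's MVC and comparing it with a chosen threshold. The sketch below is illustrative only: the function names are hypothetical, and the threshold is left as a parameter because the text does not state specific values.

```python
def percent_mvc(measured_force_n: float, mvc_force_n: float) -> float:
    """Measured force expressed as a percentage of maximum voluntary contraction."""
    if mvc_force_n <= 0:
        raise ValueError("MVC force must be positive")
    return 100.0 * measured_force_n / mvc_force_n


def within_threshold(
    measured_force_n: float, mvc_force_n: float, threshold_pct: float
) -> bool:
    """True if the task force does not exceed the supplied %MVC threshold."""
    return percent_mvc(measured_force_n, mvc_force_n) <= threshold_pct


# Hypothetical values: a 150 N push by an employee whose MVC in the same
# posture is 600 N, judged against an assumed 30 %MVC threshold.
print(percent_mvc(150.0, 600.0), within_threshold(150.0, 600.0, 30.0))
```

In practice the threshold would be selected with reference to the duration and frequency of the exertion, as the paragraph above notes.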

An alternative approach that is commonly adopted in developing occupational fitness standards is
to measure the static or dynamic strength of relevant muscle groups and to relate these to task
performance using regression equations (the criterion validity approach). This approach has been
adopted by many authors, usually with reasonable success (e.g., Poulsen, 1970; Sharp et al., 1980; Pytel &
Kamon, 1981; Ayoub et al., 1982; Teves, Wright & Vogel, 1985; Nottrodt & Celentano, 1987; Beckett
& Hodgdon, 1987; Dueker, Ritchie, Knox & Rose, 1994; Rayson, Holliman & Belyavin, 2000).

In these studies, static strength tests that were strongly associated with lift performance
included upright pull, back, arm, shoulder, and leg strength. The dynamic strength and power tests
most strongly correlated with lift performance included the Incremental Lift Machine test, bench
press, vertical and broad jump, isokinetic back extension, and isokinetic lift power. The highest
correlation coefficients for men and women separately were reported by Pytel & Kamon (1981)
between isokinetic strength scores and lift scores (r = 0.87-0.96). However, more typically r-values
of 0.3-0.5 were reported for single-sex data (e.g., Myers, Gebhardt, Crump, & Fleishman, 1984;
Teves, Wright & Vogel, 1985). Usually the r-values were higher for men than for women (e.g.,
Myers et al., 1984; Teves et al., 1985).

To improve predictive capability, many studies build multiple regression models combining
several strength and anthropometric variables in the prediction equation. R² values ranged from 0.33
and 0.11 for men and women, respectively (Teves et al., 1985), up to 0.94 in a pooled gender sample
(Pytel & Kamon, 1981). A number of studies pooled the male and female data, apparently without
first checking the validity of this technique in their population; this practice should be avoided.
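The caution about pooling can be illustrated numerically. In the sketch below, two groups have the same moderate within-group correlation between strength and lift score, but because the group means differ, the pooled correlation is considerably higher. All data are invented for illustration and do not come from any of the cited studies.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson product-moment correlation coefficient."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have equal length of at least 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical strength scores and lift scores (arbitrary units).
women_strength = [40, 45, 50, 55, 60]
women_lift = [22, 20, 25, 23, 26]
men_strength = [70, 75, 80, 85, 90]
men_lift = [38, 36, 41, 39, 42]

r_women = pearson_r(women_strength, women_lift)
r_men = pearson_r(men_strength, men_lift)
r_pooled = pearson_r(women_strength + men_strength, women_lift + men_lift)
print(f"women r={r_women:.2f}, men r={r_men:.2f}, pooled r={r_pooled:.2f}")
```

The inflated pooled value reflects the difference between group means rather than a stronger relationship within either group, which is why the within-group relationships should be checked before data are combined.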

Electromyography — EMG is a technique for measuring electrical activity of the muscle. It can be
used to assess the involvement of a muscle group in a particular movement or task, the extent of
the muscle involvement, and the state of fatigue of the muscle (Hagberg, 1981). Needle electrodes
inserted into the muscle are the preferred technology, but in an occupational setting
noninvasive surface electrodes are usually used over the belly of the muscle. The signal is amplified and
normally recorded as an electronic file for later analysis.

The presence or absence of electrical activity indicates the active involvement or otherwise of a
particular muscle group. Further, the consistent exponential relationship between EMG activity
and force during both static and dynamic activities (Hagberg, 1981) enables the user to establish
the extent of involvement of muscle groups and estimate the forces involved. To perform this type
of analysis, the EMG response to an employee’s MVC in exactly the same posture also needs to be
measured. Without maximal data on the employees under investigation, there is no means of
anchoring and interpreting the response, other than in qualitative terms.

These data can be especially useful in isolating the role of specific muscle groups in complex
activities such as loaded marching. For example, under relatively light backpack loads the electrical
activity of the erector spinae (back extensors) is lower than with no load, while under heavy load,
the muscle group is clearly more active than with no load (Knapik, Harman & Reynolds, 1996).
The gastrocnemius shows similar increases in EMG activity with load, indicating an increased
demand on this muscle group.

Fatigue of muscle can also be detected and estimated via EMG. Fatigue exhibits itself as an
increase in the amplitude in the low frequency range of EMG activity and also as a shift in the
frequencies toward the lower end of the spectrum.
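One common way to quantify such a spectral shift is to track the median frequency of the power spectrum, that is, the frequency that divides the total power in half. The text does not name a specific statistic, so the choice of median frequency here is an assumption. The sketch uses a plain discrete Fourier transform and synthetic sine waves rather than real EMG, purely to show the calculation.

```python
import math
from typing import List, Sequence, Tuple


def power_spectrum(
    signal: Sequence[float], fs_hz: float
) -> Tuple[List[float], List[float]]:
    """One-sided power spectrum via a direct discrete Fourier transform."""
    n = len(signal)
    if n == 0:
        raise ValueError("signal must not be empty")
    freqs, power = [], []
    for k in range(n // 2 + 1):
        re = sum(x * math.cos(2 * math.pi * k * i / n) for i, x in enumerate(signal))
        im = sum(x * math.sin(2 * math.pi * k * i / n) for i, x in enumerate(signal))
        freqs.append(k * fs_hz / n)
        power.append(re * re + im * im)
    return freqs, power


def median_frequency_hz(signal: Sequence[float], fs_hz: float) -> float:
    """Lowest frequency at which cumulative power reaches half the total."""
    freqs, power = power_spectrum(signal, fs_hz)
    half = sum(power) / 2.0
    running = 0.0
    for f, p in zip(freqs, power):
        running += p
        if running >= half:
            return f
    return freqs[-1]


def sine(freq_hz: float, fs_hz: float, n: int) -> List[float]:
    """Synthetic test signal; a stand-in for recorded EMG."""
    return [math.sin(2 * math.pi * freq_hz * i / fs_hz) for i in range(n)]


fresh = median_frequency_hz(sine(100.0, 1000.0, 200), 1000.0)
fatigued = median_frequency_hz(sine(60.0, 1000.0, 200), 1000.0)
print(f"median frequency: fresh {fresh:.0f} Hz, fatigued {fatigued:.0f} Hz")
```

A fall in median frequency across successive windows of a sustained contraction would be consistent with the downward spectral shift described above.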

Biomechanical models — Once forces and torques have been calculated, we can start to make
comparisons between tasks and within tasks under various conditions. We can also assess the
feasibility and safety of tasks by comparing the data with appropriate population norms and with
occupational safety and health guidelines. Below, we review the National Institute for Occupational Safety
and Health (NIOSH) equation, a widely used tool, and its use in conducting a Physical
Demands Analysis. For further detail on these topics, refer to Ayoub & Mital (1989) for an
overview of 2D and 3D, static and dynamic, biomechanical models, and to the UK Defence Standards
00-25 (Ministry of Defence, 1998) and Snook & Ciriello (1991) for military and industrial
population norms, respectively.

NIOSH published guidelines for manual lifting in 1981 (National Institute for Occupational
Safety and Health, 1981), which provided an empirical method for computing a load limit for
lifting. The guide considered the epidemiology of musculoskeletal injury and set biomechanical,
physiological, and psychological limits. Its application was limited to two-handed, symmetrical, smooth
lifting directly in front of the body using handles. Six different factors are used to determine the
action limit (AL): the object load, the horizontal distance between ankles and hands, the vertical
distance between hands and floor, the vertical travel distance, the frequency, and the duration of the
lifting task. These six parameters are entered into the NIOSH equation.

The  AL  is  defined  as  the  load  that  can  be  safely  handled  by  75  percent  and  99  percent  of  women 
and  men,  respectively.  Thus  lifting  loads  that  fall  below  the  AL  are  considered  safe.  The  maximum 
permissible  limit  (MPL),  which  is  three  times  the  AL,  is  the  load  that  can  be  safely  handled  by  only 
25%  of  men  and  virtually  no  women.  Loads  that  fall  between  the  AL  and  the  MPL  pose  an 
increased  risk  for  some  workers.  Loads  above  the  MPL  pose  a  significant  risk  for  many  workers. 
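Given an AL that has already been computed for a task, the three risk bands described above can be expressed as a short classification. The sketch below does not compute the AL itself; the function name and example values are hypothetical, and treating a load exactly equal to the AL as falling in the middle band is a conservative assumption.

```python
def classify_lift(load_kg: float, action_limit_kg: float) -> str:
    """Place a lifted load in the NIOSH (1981) bands, with MPL = 3 x AL."""
    if action_limit_kg <= 0:
        raise ValueError("action limit must be positive")
    mpl_kg = 3.0 * action_limit_kg  # maximum permissible limit
    if load_kg < action_limit_kg:
        return "below AL"
    if load_kg <= mpl_kg:
        return "between AL and MPL"
    return "above MPL"


# Hypothetical task with an action limit of 15 kg.
for load in (10.0, 30.0, 50.0):
    print(load, classify_lift(load, action_limit_kg=15.0))
```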

In 1991 the NIOSH equation was revised (Waters, Putz-Anderson, Garg & Fine, 1993) to
encompass a broader range of tasks, and alterations were made to the various criteria. Six
coefficients were used to reduce the load constant to compensate for characteristics of the lift that were
different from the standard conditions.

For a Physical Demands Analysis, the NIOSH equation serves two uses. First, it can be used to
provide an index of acceptability for each task that is performed by the work force. Second, by calculating
a ratio between the recommended weight of lift and the actual weight of lift, we can obtain an index of
task difficulty.

Worked Example 4 illustrates how we used the NIOSH equation and the
AL and MPL to classify and interpret material handling tasks performed by all Career
Employment Groups in the British Army.


Worked  Example  4 

As part of a Physical Demands Analysis that we conducted on occupations within the
British Army (Rayson, 1998), we submitted the data relating to all of the material handling
tasks to the NIOSH (1981) equation. We found that only 12% of tasks were “acceptable”
(falling below the AL), while 67% required redesign (falling between the AL and MPL). The
remaining 21% of tasks exceeded the MPL; these were considered unsafe to perform under
any circumstances by the NIOSH criteria. Among the nine tasks that fell into this last
category, four exceeded the AL by fourfold.


Psychophysical  Approach 

The psychophysical approach assumes that both biomechanical and physiological stresses
impinge on an employee performing any task and that these stresses are integrated and combined
and can be assessed as an objective measure of acceptable demand rate of repetitive work or
perceived stress. There are several psychophysical methods that may be useful during a Physical
Demands Analysis, including obtaining ratings of perceived exertion (RPE) (Borg, 1982, 1985,
1998) and perceived effort (Fleishman, Gebhardt & Hogan, 1984), and using the Body Map for
evaluating body part discomfort (Wilson & Corlett, 1995). For a full discussion of measurement of
psychological demand, refer to the textbook A Guide to Manual Materials Handling by Mital,
Nicholson & Ayoub (1993).

Borg’s Rating of Perceived Exertion — The Borg Scale (Borg, 1998) is based on the linear
relationship that exists between workload, heart rate, and perceived exertion. In the original scale,
which spans 6 (no exertion at all) through 20 (maximal exertion), the ratings approximately
correspond to heart rate divided by 10. Some authors also use an alternative nonlinear 10-point
Category Ratio Scale (the CR-10), but this version appears to be less widely used.
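The approximate correspondence between the original 6 to 20 scale and heart rate can be written as a pair of helper functions. This is a rough rule of thumb only, as the text indicates; the function names are hypothetical, and clamping out-of-range heart rates to the ends of the scale is an assumption of this sketch.

```python
def approximate_heart_rate_bpm(rpe: int) -> int:
    """Heart rate roughly implied by a rating on the 6-20 Borg scale."""
    if not 6 <= rpe <= 20:
        raise ValueError("RPE must lie between 6 and 20")
    return rpe * 10


def approximate_rpe(heart_rate_bpm: float) -> int:
    """Rating roughly implied by a heart rate, clamped to the 6-20 range."""
    rating = round(heart_rate_bpm / 10)
    return max(6, min(20, rating))


print(approximate_heart_rate_bpm(13), approximate_rpe(152))
```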

The scale is described to the employee before the activity commences and is presented to the
employee at fixed time intervals or during particular target activities to obtain an RPE. The RPEs
may provide additional subjective information to accompany physiological measurements, or
they may be used instead of HR measurements when the latter are not feasible for whatever
reason. Similar Visual Analogue Scales are used to rate pain.

Fleishman’s Perceived Effort Index — Building on the work of Borg and co-workers in Sweden,
Fleishman developed a 7-point effort scale intended for use in the workplace. On the early version
of the scale, each number was anchored by verbal descriptions (e.g., 1 = very, very light; 4 =
somewhat hard; 7 = very, very hard). Fleishman found that both job-experienced and inexperienced men
and women were able to make fine distinctions between tasks at all ranges of effort. For example,
lifting and carrying objects weighing 85 to 100 pounds (mean rating 6.6) and laying railroad tracks
(mean rating 6.3) were rated as the most physically demanding tasks (Hogan & Fleishman, 1979).
A correlation of 0.81 between the metabolic cost and the mean ratings for 30 tasks demonstrated
the strength of the relationship.

The scale was later modified by replacing the adjectives with behavioral task anchors of high,
medium, and low effort (e.g., operate a jackhammer [mean 5.91], perform light welding [mean
3.27], and operate a calculator [mean 1.08]) (Fleishman, Gebhardt, & Hogan, 1984). The reliability
and validity of the perceived effort index led the authors to conclude that it could be substituted for
actual physiological measurement of work across a wide variety of physically demanding tasks.
Herein lies the usefulness and potential for this index in performing a Physical Demands Analysis.

Body Map — Wilson and Corlett's Body Map (Wilson & Corlett, 1995) is a useful tool for
evaluating both the location and severity of body part discomfort. As displayed in Figure 3.3, the body is
divided into segments; the actual size and clustering of the segments can be modified to suit the
application. The scale can be used in a number of ways. In its simplest form, employees can be shown
the scale during or after performing particular work tasks, and they can indicate in which parts of
the body they are experiencing discomfort. If more time is available for measurement, severity of
discomfort can also be rated using a 5-point paper scale anchored at the 0 and 5 points by “no
discomfort” and “extreme discomfort.” Care should be taken when summarizing the data: calculating
mean ratings may not be appropriate, since mean values will mask interesting individual differences
that might be linked to body size, particular items of equipment used, age, or gender. We have found
it useful to use two body maps simultaneously, representing the front and back of the body. Lower
back, hamstring, and calf discomfort, for example, can then be more precisely reported.

Figure 3.3 Copyright 1995. From Wilson and Corlett's body map by J. R. Wilson & E. N. Corlett. Reproduced
by permission of Taylor & Francis, Inc., http://www.routledge-ny.com


Summary 


In this chapter, we have presented and discussed the relative merits of a number of approaches
and techniques for conducting a job analysis. These have included some industrial psychological
techniques such as observation, questionnaire, and interview for identifying the most physically
demanding tasks. Once the critical and most frequently performed tasks have been identified,
various physiological, biomechanical, and psychophysical techniques can be deployed to quantify the
stress and strain associated with these tasks. These have been described, and worked examples
have been provided from our own experience of their application.

The physiological techniques reviewed include measurement of oxygen uptake to estimate
energy expenditure, heart rate to assess cardiovascular strain, body temperature to investigate
thermal strain, hormones to quantify stress levels, and Global Positioning System to track movement
and to estimate energy expenditure. The biomechanical techniques presented comprise posture
analysis to describe body position and to identify any health and safety issues, force measurement
to quantify the extent of the forces exerted, electromyography to assess muscle involvement and
fatigue, and the use of biomechanical models to make comparisons between tasks and within tasks.
The psychophysical techniques encompass subjective rating tools such as Borg’s Rating of
Perceived Exertion, Fleishman's Perceived Effort Index, and Wilson and Corlett’s Body Map.

The selection of approach and techniques by the investigator will depend on many factors,
including the job or task under investigation, the resources and time available, and the expertise of
the investigation team. Generally, a multidisciplinary approach performed by a multiskilled team is
preferred because it is more likely to elicit a complete and balanced output. Time should be taken
by the investigating team to reflect and to discuss.

In  conclusion,  conducting  ajob  analysis  to  identify  and  quantify  the  most  physically  demand¬ 
ing  key  tasks  in  ajob  is  a  complex  process  that  requires  considerable  investment  of  time,  money, 
and  effort.  Good  science  and  good  judgment  are  required  in  equal  measure.  The  result  is  a  solid 
foundation  on  which  to  base  selection,  training,  and  retention  fitness  criteria.  The  payback  will  be 
increased  productivity  through  improved  operational  effectiveness  and  reduced  injury. 

Abstract


This chapter reviews the two general types of tests used to evaluate a person's ability to do
physically demanding work: basic ability tests and work sample tests. The basic ability tests reviewed
include aerobic fitness, body composition, strength, muscle endurance, and flexibility. The common
laboratory and field tests are reviewed and evaluated. Work sample tests are designed to duplicate
occupational tasks. The weaknesses and strengths of work sample tests are discussed. Research
confirms that performance on physically demanding work sample tests depends, to varying degrees,
on aerobic fitness, body composition, and strength. Although important, flexibility, balance, and
agility are less likely to be related to physically demanding tasks. This chapter reviews these data.
When basic ability tests are highly correlated with work sample tests, one test administration
option is to replace the work sample test with a basic ability test or combination of basic ability
tests. The final section of the chapter is a brief review of the role of aerobic fitness, strength, and
body composition in health and injury.


Introduction


The tests used to evaluate one's fitness to perform physically demanding work tasks can
be categorized into two general types: physical ability tests and work sample tests. Physical ability
tests measure the basic fitness components of aerobic capacity, body composition, strength,
muscular endurance, and flexibility. Physical ability tests evaluate not only a person's capacity to do
demanding work tasks but also his or her physical fitness. In contrast, work sample tests evaluate only a
person's ability to perform a work task or combination of tasks.

This chapter is an overview of basic ability tests that measure aerobic capacity, body composition,
muscular endurance, strength, and flexibility. Work sample tests, along with their strengths and
limitations, are discussed. Next, the relationship between basic ability tests and work sample tests is
examined. Although the primary objective of this State-of-the-Art Report (SOAR) is to examine the role
of physical fitness in performing physically demanding jobs, fitness also has a health promotion
component. The final section of the chapter examines the role of fitness in health and risk of injury.


Physical  Ability  Tests 


This section covers the basic physical ability tests that measure aerobic fitness, body
composition, strength, muscular endurance, and flexibility, with an overview of the nature and types
of tests used to measure these physical abilities.

Aerobic  Fitness 

Aerobic fitness, the maximal volume of oxygen one can consume during exhausting exercise
(VO2max), depends on several factors, including efficient lungs, heart, and blood vessels; the
quality and quantity of blood (red blood count and volume); and the cellular components that help the
body use oxygen during exercise. Because a person's ability to use oxygen during exhaustive work
depends on these factors, VO2max is an accepted test of aerobic fitness and an indicator of
subsequent exercise capacity (ACSM, 1990; ACSM, 1991). Astrand and Rodahl (1970) consider it
to be the best index of physical fitness:

During prolonged heavy physical work, a person's performance capacity depends largely upon his
ability to take up, transport, and deliver oxygen to working muscle. Subsequently, the maximal
oxygen uptake is probably the best laboratory measure of a person's physical fitness, providing the
definition of physical fitness is restricted to the person's capacity for prolonged heavy work. (Astrand
& Rodahl, 1970, p. 314)

There  are  several  different  ways  to  categorize  and  contrast  aerobic  fitness  tests.  These  are — 

1. laboratory tests or field tests,

2.  maximal  or  submaximal  tests,  and 

3. tests that do not involve exercise, that is, nonexercise tests (Baumgartner & Jackson, 1999).

A  discussion  of  these  tests  follows. 

Laboratory VO2max Test — Laboratory tests involve increasing power output slowly and systematically
from a resting to a maximum level while expired gases are measured. Maximal oxygen uptake is the
maximum volume of oxygen a subject uses during exhausting exercise (Mitchell & Blomqvist, 1971;
Mitchell, Sproule, & Chapman, 1958; Rowell, Taylor, & Wang, 1964). Cycle ergometers and treadmills
regulate power output. A computer-controlled metabolic cart measures the volume of oxygen consumed
during the test protocol. Maximum aerobic power, or VO2max, is the maximal volume of oxygen one can
consume at maximum power output. Laboratory tests require expensive equipment and trained
technicians. For this reason, the direct measurement of VO2max is typically done only in research
and hospital settings.

Although the textbook definition of VO2max is the point at which an increase in power output
does not produce an increase in VO2, some researchers question this criterion. Frequently, a
subject will not reach a plateau (Noakes, 1988; Noakes, 1997). A common procedure is to use other
criteria, which often include the following —

1.  voluntary  exhaustion, 

2. a respiratory exchange ratio > 1.0 or > 1.1, and

3. an exercise heart rate > 90% of age-predicted maximum exercise heart rate.

The question of whether a person reaches true VO2max is somewhat controversial (Howley &
Bassett, 1997; Noakes, 1997; Noakes, 1998), resulting in the common practice of using the term
VO2peak, or the peak level reached, rather than VO2max.

Maximum Treadmill Tests — Aerobic fitness is estimated from maximal treadmill time following a
standard treadmill protocol (Baumgartner & Jackson, 1999; Ross & Jackson, 1990). Most treadmill
tests given in the United States use the Bruce protocol, followed by the Balke (see below). Treadmill
protocols start at a low level and systematically increase power output by increasing treadmill
speed, elevation, or both. The longer subjects can continue to exercise at the increasing power
output, the higher their VO2max. Since treadmill protocols use a standard method to increase power
output, elapsed time to reach exhaustion is an index of maximum treadmill power output.

The method used to estimate VO2max from maximum treadmill time involves testing a person
by following the standard Bruce or Balke treadmill protocol. The test ends at the subject's voluntary
exhaustion. Valid regression equations provide the means of estimating VO2max (ml/kg/min) from
maximal treadmill time (Bruce, Kusumi, & Hosmer, 1973; Foster, Jackson, & Pollock, 1984; Pollock,
Hickman, & Kendrick, 1976). Since each treadmill protocol increases power output at a different rate,
a unique equation is published for each protocol. The reported correlations between directly
measured VO2max and maximal treadmill time are high, ranging from 0.88 to 0.97. The standard error
of prediction is about 3 ml/kg/min. Following are the regression equations to estimate VO2max
(ml/kg/min) from treadmill time in minutes (T) for the Balke and Bruce treadmill protocols.

Balke Treadmill Test (Pollock et al., 1976), R = 0.88

$$\mathrm{VO_{2max}}\ (\mathrm{ml/kg/min}) = 14.99 + (1.44 \times T) \tag{1}$$

Bruce Treadmill Test, Healthy Subjects (Foster et al., 1984), R = 0.97

$$\mathrm{VO_{2max}}\ (\mathrm{ml/kg/min}) = 17.50 - (0.30 \times T) + (0.297 \times T^{2}) - (0.0077 \times T^{3}) \tag{2}$$


The Bruce equation is nonlinear and was developed with both healthy subjects and cardiac
patients (Foster et al., 1984). The equation given here is for persons free of coronary heart
disease. The VO2max of heart patients is 4.2 ml/kg/min lower than that of healthy subjects for a given
maximum treadmill time.
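
As a minimal sketch, Equations 1 and 2 can be evaluated directly from treadmill time. The function names are hypothetical, the coefficients are copied from the text as printed, and treating the 4.2 ml/kg/min difference for heart patients as a flat offset is an assumption about how that adjustment would be applied.

```python
def balke_vo2max(t_min: float) -> float:
    """Equation 1: estimated VO2max (ml/kg/min) from Balke treadmill time (min)."""
    return 14.99 + 1.44 * t_min


def bruce_vo2max(t_min: float, heart_patient: bool = False) -> float:
    """Equation 2: estimated VO2max (ml/kg/min) from Bruce treadmill time (min).

    The healthy-subject cubic is used as printed. For heart patients the
    estimate is lowered by 4.2 ml/kg/min (assumed to be a flat offset).
    """
    estimate = 17.50 - 0.30 * t_min + 0.297 * t_min**2 - 0.0077 * t_min**3
    return estimate - 4.2 if heart_patient else estimate
```

Because the reported standard error of prediction is about 3 ml/kg/min, any value returned by these functions is best read as an estimate with roughly that much uncertainty.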

Submaximal Aerobic Fitness Laboratory Tests — Exercising to VO2max is physically exhausting,
time-consuming, and expensive, and it requires medical supervision when testing high-risk subjects.
Laboratory submaximal tests provide a less accurate but easier and safer method of estimating
aerobic fitness. The measurement objective of submaximal tests is to define the slope of a person's
heart-rate response to exercise and to use that slope to estimate VO2max from submaximal parameters.
The three exercise physiology principles that are the foundation of submaximal tests are as follows —

* Heart rate (i.e., pulse rate) increases in direct proportion to the oxygen used during aerobic exercise.

* VO2max is reached at maximum heart rate.

* A less fit person will have a higher heart rate at any submaximal level than someone who is
more aerobically fit.

Oxygen uptake (VO2) at any level of exercise is the product of cardiac output and the difference
in the oxygen content of the arterial and venous blood. Cardiac output, the volume of blood
pumped by the heart each minute, is the product of heart rate and stroke volume. Stroke volume
increases early in exercise and stabilizes at about 45 percent of VO2max. The testing goal when
estimating VO2max from submaximal power output is to measure heart rate between 45 and 70
percent of a person's VO2max (approximately 115 to 150 b/min). Below 45 percent of VO2max, stroke
volume has not leveled off, whereas at about 70 percent of VO2max, exercise is likely shifting from
aerobic to anaerobic. Submaximal tests therefore use a submaximal aerobic power output. Single-stage
and multistage models estimate VO2max from submaximal power output and exercise heart rate
(Baumgartner & Jackson, 1999; Ross & Jackson, 1990).

The multistage exercise test requires that heart rate and power output be measured at two or more
submaximal levels (Golding, Meyers, & Sinning, 1989). These data points are projected to maximal heart
rate to estimate aerobic fitness. The multistage model is the procedure used for the popular
YMCA adult fitness test (Golding et al., 1989). The YMCA test uses a cycle ergometer following a
branching protocol to regulate power output for each 3-minute stage. The goal of the test is to obtain
at least two submaximal heart rates between 115 and 150 b/min. VO2max is estimated by plotting
the linear increase in exercise heart rate associated with increases in power output. Connecting the
two points defines the slope of the relationship between power output and heart rate. The line defined
by the slope is extended to maximum heart rate, estimated by 220 - age, and VO2max is estimated by
dropping a line down to the power output scale, expressed as absolute VO2 (ml/min).
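
The graphical extrapolation described above can be sketched numerically. This is an illustration under stated assumptions, not the YMCA procedure itself: the function name is hypothetical, only two submaximal stages are used, and each stage's power output is assumed to have already been converted to absolute VO2 (ml/min).

```python
def multistage_vo2max(vo2_1: float, hr_1: float,
                      vo2_2: float, hr_2: float, age: float) -> float:
    """Extrapolate two submaximal (VO2, heart rate) points to maximum heart rate.

    vo2_1, vo2_2: absolute VO2 (ml/min) at the two submaximal stages.
    hr_1, hr_2:   steady-state heart rates (b/min) at those stages.
    Maximum heart rate is estimated as 220 - age, as in the text.
    """
    if hr_2 == hr_1:
        raise ValueError("the two stages must have different heart rates")
    slope = (vo2_2 - vo2_1) / (hr_2 - hr_1)  # ml/min of VO2 per beat/min
    hr_max = 220.0 - age
    # Extend the line through the two points out to the estimated maximum heart rate.
    return vo2_2 + slope * (hr_max - hr_2)
```

With more than two stages, a least-squares line through all points would be a natural extension of the same idea.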

The single-stage exercise test model is both simpler to use and slightly more accurate than the
multistage model (Mahar, Jackson, & Ross, 1985). It was initially popularized by the Astrand-Ryhming
nomogram (Astrand & Rodahl, 1970; Astrand & Rodahl, 1986; Astrand & Ryhming,
1954). Research (Jackson et al., 1990) showed that the correlation between VO2max estimated
from the single-stage equation and from the Astrand-Ryhming nomogram was 0.99. Equation 3 is the
mathematical representation of the Astrand-Ryhming nomogram (Baumgartner & Jackson, 1999;
Jackson et al., 1990; Ross & Jackson, 1990).

Single-Stage Submaximal VO2max Equation

$$\mathrm{VO_{2max}} = \mathrm{VO_{2}SM} \times \frac{220 - \mathrm{Age} - k}{\mathrm{SM\,HR} - k} \tag{3}$$

Power output can be regulated with either a cycle ergometer or a treadmill. The VO2 SM term in
Equation 3 is the VO2 at submaximal exercise. Standard treadmill or cycle ergometer energy cost
equations estimate it from power output (ACSM, 1991; Baumgartner & Jackson, 1999). The term SM HR is
the submaximal heart rate at VO2 SM, and k is a constant of 61 for men and 73 for women. The
constants represent the intercept of the heart rate and VO2 relationship, that is, the heart rate for a VO2 of 0.
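
The single-stage model can be sketched as follows, assuming Equation 3 has the form VO2max = VO2 SM x (220 - age - k) / (SM HR - k), which is the reading consistent with the definitions of SM HR and k above and with the later use of 220 - age as the maximum heart rate term. The function name and argument names are hypothetical.

```python
def single_stage_vo2max(vo2_sm: float, hr_sm: float, age: float, male: bool) -> float:
    """Single-stage submaximal estimate of VO2max (assumed form of Equation 3).

    vo2_sm: VO2 at the submaximal workload (result is in the same units).
    hr_sm:  steady-state heart rate (b/min) at that workload.
    k is 61 for men and 73 for women, as stated in the text.
    """
    k = 61.0 if male else 73.0
    if hr_sm <= k:
        raise ValueError("submaximal heart rate must exceed the intercept k")
    return vo2_sm * (220.0 - age - k) / (hr_sm - k)
```

The estimate scales the measured submaximal VO2 by the ratio of the heart rate reserve at the predicted maximum to the reserve at the submaximal workload, both taken above the intercept k.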

The errors associated with submaximal tests arise from the measurement of exercise heart rate, the use
of 220 - age as a representation of “true maximum heart rate,” and the estimation of
submaximal VO2 when it is not directly measured. The standard error of representing maximum
heart rate by 220 - age is about ± 10 beats/min (ACSM, 1991). Although this affects accuracy
somewhat, it is not a major source of error. What introduces major systematic prediction errors
are conditions that affect the heart rate response to exercise. A major source of inaccuracy is drugs
such as beta blockers, which lower both exercise and maximum heart rate and thus produce an
overestimate of true VO2max. Submaximal tests are not suitable for subjects taking drugs that alter
the heart rate response to exercise. Another source of error is submaximal VO2, which is rarely
measured during an exercise test. Standard equations are
available to estimate VO2 from cycle ergometer (Astrand & Rodahl, 1986; Astrand & Ryhming,
1954) and treadmill power output (ACSM, 1991; Ross & Jackson, 1990). Research (Jackson et al.,
1990; Ross & Jackson, 1986) shows that these estimates are a major source of error.

Maximal Distance Run Field Tests — Running performance has been shown to be related to
VO2max. When performed properly, running tests provide valid assessments of aerobic fitness, but
they are not as accurate as the regression equations derived for maximum treadmill protocols in which
the speed and elevation are strictly regulated. The most common run tests involve jogging and/or
walking distances ranging from 1 to 3 miles, or traveling as far as possible in 12 minutes.

Dr. Kenneth Cooper was one of the first to popularize distance run/walk tests. The goal of his
research was to provide a field test to assess the aerobic fitness of U.S. Air Force personnel (Cooper,
1968). His sample consisted of 115 airmen who ranged considerably in age (17 to 52 years), weight
(114 to 270 pounds), and aerobic fitness (28 to 60 ml/kg/min). The subjects first completed the
12-minute run/walk test, in which the distance covered in miles was recorded. On the next day the
subjects completed a maximal treadmill test in which VO2max was measured by indirect calorimetry.
Cooper found a very high correlation (r = 0.90) between distance covered in 12 minutes and measured
VO2max. He published the following regression equation, which expresses distance traveled in
miles as a function of VO2max (ml/kg/min) —

Cooper's 12-minute Run/Walk Model

$$\mathrm{Distance\ (miles)} = 0.3138 + (0.0278 \times \mathrm{VO_{2max}}\ \mathrm{ml/kg/min}) \tag{4}$$
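
In practice the 12-minute test is used the other way round, to estimate VO2max from distance. The sketch below evaluates Equation 4 (distance = 0.3138 + 0.0278 x VO2max) and its algebraic inverse. The function names are hypothetical, and rearranging a regression line is only an approximation of regressing VO2max on distance directly.

```python
def cooper_distance_miles(vo2max: float) -> float:
    """Equation 4: predicted 12-minute distance (miles) from VO2max (ml/kg/min)."""
    return 0.3138 + 0.0278 * vo2max


def cooper_vo2max(distance_miles: float) -> float:
    """Algebraic inverse of Equation 4: VO2max (ml/kg/min) from 12-minute distance."""
    return (distance_miles - 0.3138) / 0.0278
```

For example, a runner who covers 1.5 miles in 12 minutes would be assigned an estimated VO2max of a little under 43 ml/kg/min by this rearrangement.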


Cureton and associates (Cureton, Sloniger, O'Bannon, Black, & McCormack, 1995) published
a comprehensive study relating 1-mile run/walk performance to VO2max. Their heterogeneous
sample consisted of more than 750 males and females who ranged in age from 8 to 25 years.
The goal of the study was to develop a generalized regression equation that provided valid estimates
of aerobic fitness for youth and adults of both genders.

These researchers found that the relationship between VO2max and mile run/walk time was not
linear and that gender, age, and body mass index (BMI) accounted for aerobic fitness variance. The
multiple correlation of the generalized equation was 0.72, and the standard error of estimate was 4.8
ml/kg/min. Equation 5 gives the generalized 1-mile run/walk equation. The term T is mile run/walk
time in minutes; G is gender, coded female = 0 and male = 1; and BMI is body mass index.

Cureton's 1-mile Run/Walk Generalized Equation

$$\mathrm{VO_{2max}}\ (\mathrm{ml/kg/min}) = 108.94 - (8.41 \times T) + (0.34 \times T^{2}) + (0.21 \times \mathrm{Age} \times G) - (0.84 \times \mathrm{BMI}) \tag{5}$$
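
A minimal sketch of Equation 5 follows. The function and argument names are hypothetical; the coefficients and the gender coding (female = 0, male = 1) are taken from the text.

```python
def cureton_vo2max(mile_time_min: float, age: float, male: bool, bmi: float) -> float:
    """Equation 5: VO2max (ml/kg/min) from 1-mile run/walk time in minutes.

    Gender is coded G = 1 for males and G = 0 for females, so the age term
    contributes only for males.
    """
    g = 1.0 if male else 0.0
    t = mile_time_min
    return 108.94 - 8.41 * t + 0.34 * t**2 + 0.21 * age * g - 0.84 * bmi
```

Since the study sample ranged from 8 to 25 years of age, applying the function outside that range would be an extrapolation.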

Distance run tests are the most commonly used field tests of aerobic fitness. Note that a timed
distance  run  test  is  a  maximal  test.  It  has  all  the  attendant  risks  of  a  maximal  test  with  the  added 
risk  of  being  unsupervised.  Maximal  distance  run  tests  are  suitable  only  for  young  people  in  good 
condition  without  significant  cardiovascular  disease  risk  factors. 

Submaximal Field Test — The Rockport Walk Test — A limitation of a laboratory submaximal test
is the need for a cycle ergometer or treadmill to regulate power output. The Rockport Walk Test
(Kline, Porcari, & Hintermeister, 1987) provides a means of estimating VO2max from heart rate
response to walking speed. A track and heart-rate monitoring equipment are needed to administer the
test. The Rockport Test involves walking as fast as possible for 1 mile and then measuring exercise
heart rate immediately after the walk. The data needed to estimate VO2max include the following —

• Weight measured in pounds

• Mile walk time in minutes

• Exercise heart rate (beats/min) measured immediately at the end of the walk

• Age measured to the last year

• Gender coded, female = 0 and male = 1.

Multiple regression equations were developed to estimate VO2max (ml/kg/min) from these
variables. A general equation was developed for men and women. The Rockport Walk Test equation is
shown in Equation 6, where W is body weight in pounds, T is mile walk time in minutes, HR is
exercise heart rate, and G is gender, female = 0 and male = 1.

Rockport Walk Test (R = 0.88, SEE = 5 ml/kg/min)

$$\mathrm{VO_{2max}}\ (\mathrm{ml/kg/min}) = 132.85 - (0.39 \times \mathrm{Age}) - (0.08 \times W) - (3.26 \times T) - (0.16 \times \mathrm{HR}) + (6.32 \times G) \tag{6}$$
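
Equation 6 can be sketched as a single function. The function and argument names are hypothetical; the coefficients, units, and gender coding follow the text.

```python
def rockport_vo2max(age: float, weight_lb: float, walk_time_min: float,
                    heart_rate: float, male: bool) -> float:
    """Equation 6: VO2max (ml/kg/min) from the Rockport 1-mile walk.

    weight_lb:     body weight in pounds.
    walk_time_min: 1-mile walk time in minutes.
    heart_rate:    exercise heart rate (b/min) taken at the end of the walk.
    """
    g = 1.0 if male else 0.0
    return (132.85 - 0.39 * age - 0.08 * weight_lb
            - 3.26 * walk_time_min - 0.16 * heart_rate + 6.32 * g)
```

Because heart rate enters with a negative coefficient, anything that lowers exercise heart rate independently of fitness raises the estimate.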

An assumption of the Rockport Walk Test is that exercise heart rate is at a steady state. This is
best assured by walking at a brisk, steady pace. Since the Rockport Test is a submaximal test,
factors that affect exercise heart rate reduce its accuracy. For example, someone taking beta blocker
drugs will have a lower than normal exercise heart rate. This will lead to a systematic overestimate
of VO2max. These prediction errors can be quite large. Another important consideration is pacing.

Nonexercise Aerobic Fitness Tests — VO2max can be estimated without exercise testing (Jackson
et al., 1990). Aerobic fitness can be estimated with reasonable accuracy from a person's age, gender,
body composition, and self-reported level of aerobic exercise. A very large database of NASA/Johnson
Space Center (Houston, TX) employees was used to develop the nonexercise models. VO2max was
measured by indirect calorimetry. Before being tested, the employees rated their physical activity
(Figure 4.1) during the previous month. Multiple regression was used to estimate VO2max from
the exercise rating in combination with age, gender, and a body composition parameter consisting of
either percent body fat or body mass index.

The original research (Jackson et al., 1990) provided two equations that could be used for men
and women. The difference between the equations was the body composition variable. The more accurate
equation (R = 0.81, SEE = 5.3 ml/kg/min) used skinfold-determined percent body fat (Jackson &
Pollock, 1978; Jackson, Pollock, & Ward, 1980). The equation that used BMI was slightly less
accurate (R = 0.78, SEE = 5.7 ml/kg/min), but more feasible for mass testing. A limitation of these
early equations was that the women's sample was much smaller than the men's.
Additional studies were published from this database (Jackson et al., 1995; Jackson et al., 1996b)
in which the goal was to examine the influence of aging on VO2max. A much larger sample of
women was used, providing a means of developing gender-specific equations. Following are the
male and female equations (Equations 7 to 10), where AR is the activity rating code (0 to 7), %fat is
percent body fat, and BMI is body mass index. The BMI equations are published in another source
(Baumgartner & Jackson, 1999).

Percent Fat Nonexercise Men's Equation (R = 0.79, SEE = 4.9 ml/kg/min)

$$\mathrm{VO_{2max}}\ (\mathrm{ml/kg/min}) = 47.820 - (0.259 \times \mathrm{Age}) - (0.216 \times \%\mathrm{fat}) + (3.275 \times \mathrm{AR}) - (0.082 \times \%\mathrm{fat} \times \mathrm{AR}) \tag{7}$$

Percent Fat Nonexercise Women's Equation (R = 0.85, SEE = 4.4 ml/kg/min)

$$\mathrm{VO_{2max}}\ (\mathrm{ml/kg/min}) = 45.628 - (0.265 \times \mathrm{Age}) - (0.309 \times \%\mathrm{fat}) + (2.175 \times \mathrm{AR}) - (0.044 \times \%\mathrm{fat} \times \mathrm{AR}) \tag{8}$$

BMI Nonexercise Men's Equation (R = 0.74, SEE = 5.4 ml/kg/min)

$$\mathrm{VO_{2max}}\ (\mathrm{ml/kg/min}) = 55.688 - (0.362 \times \mathrm{Age}) - (0.331 \times \mathrm{BMI}) + (4.310 \times \mathrm{AR}) - (0.096 \times \mathrm{BMI} \times \mathrm{AR}) \tag{9}$$

BMI Nonexercise Women's Equation (R = 0.82, SEE = 4.7 ml/kg/min)

$$\mathrm{VO_{2max}}\ (\mathrm{ml/kg/min}) = 44.310 - (0.326 \times \mathrm{Age}) - (0.227 \times \mathrm{BMI}) + (4.471 \times \mathrm{AR}) - (0.135 \times \mathrm{BMI} \times \mathrm{AR}) \tag{10}$$
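
Equations 7 to 10 share one structure (intercept, age, body composition, activity rating, and a body composition by activity interaction), so they can be sketched as a single table-driven function. The function name, the dictionary layout, and the string keys are hypothetical; the coefficients are those printed above.

```python
# (intercept, age, body composition, activity rating, composition x activity)
_NONEXERCISE_COEFFS = {
    ("male", "pct_fat"): (47.820, 0.259, 0.216, 3.275, 0.082),    # Equation 7
    ("female", "pct_fat"): (45.628, 0.265, 0.309, 2.175, 0.044),  # Equation 8
    ("male", "bmi"): (55.688, 0.362, 0.331, 4.310, 0.096),        # Equation 9
    ("female", "bmi"): (44.310, 0.326, 0.227, 4.471, 0.135),      # Equation 10
}


def nonexercise_vo2max(sex: str, measure: str, age: float,
                       body_comp: float, activity_rating: int) -> float:
    """Estimate VO2max (ml/kg/min) without exercise testing.

    sex:             "male" or "female".
    measure:         "pct_fat" (percent body fat) or "bmi" (body mass index).
    body_comp:       the value of the chosen body composition measure.
    activity_rating: self-reported activity code, 0 to 7.
    """
    if not 0 <= activity_rating <= 7:
        raise ValueError("activity rating must be between 0 and 7")
    b0, b_age, b_comp, b_ar, b_int = _NONEXERCISE_COEFFS[(sex, measure)]
    return (b0 - b_age * age - b_comp * body_comp
            + b_ar * activity_rating - b_int * body_comp * activity_rating)
```

The interaction term means that the benefit attributed to each step of the activity rating shrinks as percent fat or BMI rises.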


Code  for  Physical  Activity 

Use  the  appropriate  number  (0  to  7)  which  best  describes  your  general  ACTIVITY 
LEVEL  for  the  PREVIOUS  MONTH. 

Do  not  participate  regularly  in  programmed  recreation  sport  or  heavy  physical  activity. 

0  -  Avoid  walking  or  exertion,  e.g.,  always  use  elevator,  drive  whenever  possible 
instead  of  walking. 

1  -  Walk  for  pleasure,  routinely  use  stairs,  occasionally  exercise  sufficiently  to  cause 
heavy  breathing  or  perspiration. 


2  -  10  to  60  minutes  per  week. 

3  -  Over  one  hour  per  week. 

Participate regularly in heavy physical exercise such as running or jogging, swimming, cycling,
rowing, skipping rope, running in place, or engaging in vigorous aerobic activity type exercise such
as tennis, basketball, or handball.

4 - Run less than one mile per week or spend less than 30 minutes per week in
comparable physical activity.

5 - Run 1 to 5 miles per week or spend 30 to 60 minutes per week in comparable
physical activity.

6 - Run 5 to 10 miles per week or spend 1 to 3 hours per week in comparable
physical activity.

7 - Run over 10 miles per week or spend over 3 hours per week in comparable
physical activity.


Figure 4.1 Scale for rating level of physical activity. The directions are to select one value that best
represents the level of physical activity for the previous month. The scale was developed for use in the
Cardio-pulmonary Laboratory, NASA/Johnson Space Center, Houston, Texas. Source: Baumgartner and Jackson (1999).

The nonexercise tests are especially feasible for mass testing. Since heart rate is not a factor in the
nonexercise test, the nonexercise equations are valid for subjects taking heart-rate-altering medication
(Jackson et al., 1990). Given the ease of test administration, one may question their accuracy.
It has been demonstrated that the nonexercise tests were more accurate than the single-stage
submaximal test (Jackson et al., 1990), but less accurate than maximal treadmill performance
(Baumgartner & Jackson, 1999). The obvious limitation of the nonexercise approach is its reliance on
subjects' subjective ratings of their level of exercise. The BMI models require only self-reported data
and so provide a means of estimating the VO2max of large samples by questionnaire.

Evaluation Standards — Table 4.1 lists the age and gender aerobic fitness standards recommended
by the American College of Sports Medicine (Gettman, 1993). These standards reflect the
well-established gender difference in VO2max and the decline in VO2max with age (Buskirk &
Hodgson, 1987; Jackson et al., 1995; Jackson et al., 1996b).


Table 4.1 American College of Sports Medicine standards for VO2max (ml/kg/min)*

                            Age in Years
Standard     20-29     30-39     40-49     50-59     ≥60

Men
Excellent    ≥52       ≥49       ≥47       ≥43       ≥41
Good         49-51     46-48     44-46     40-42     38-40
Average      42-48     39-45     37-43     33-39     31-37
Fair         39-41     36-38     34-36     30-32     28-30
Poor         ≤38       ≤35       ≤33       ≤29       ≤27

Women
Excellent    ≥43       ≥40       ≥38       ≥34       ≥34
Good         40-42     37-39     35-37     31-33     31-33
Average      33-39     31-36     29-34     25-30     25-30
Fair         30-32     28-30     26-28     22-24     22-24
Poor         ≤29       ≤27       ≤25       ≤21       ≤21

* From Gettman (1993). Reprinted, by permission, from L. R. Gettman, 1993, American College of Sports Medicine: Resource manual for guidelines for exercise testing and prescription (2nd ed., pp. 229-246). Philadelphia: Lea & Febiger.


Although the norms in Table 4.1 give aerobic fitness standards for adults, the levels are not
suitable for adult health promotion. The research published by Blair and associates (1989) gives the
first scientific data defining the aerobic fitness needed for health. It showed that the aerobic fitness
health promotion threshold was 32 ml/kg/min for women and 35 ml/kg/min for men (Figure 4.2).
The mortality rate of men and women with the lowest level of aerobic fitness was four times
higher than the rate of men and women who exceeded these levels. Aerobic fitness declines with age,
and the 35 and 32 ml/kg/min levels were for men and women at age 45. Table 4.2 lists health promotion
aerobic fitness standards adjusted for the age-related decline in aerobic fitness. Also listed is the value
needed to have an aerobic power of 35 or 32 ml/kg/min at age 45 years, assuming one maintains
his or her current level of exercise and percent body fat. Research shows that changing exercise
habits and percent body fat affects the rate at which aerobic fitness changes with age (Jackson et al.,
1995; Jackson et al., 1996b).


Body  Composition 

Suitable levels of body composition are important for health and for the capacity to perform work
tasks that require a person to move his or her body weight. Following are the methods used to
evaluate body composition.

Figure 4.2 Curves of the relationship between aerobic fitness and health show that the curves level off at
32 ml/kg/min for women and 35 ml/kg/min for men. These values define the threshold level of fitness
needed for health promotion (i.e., reduced mortality) for 45-year-old men and women. Journal of the American
Medical Association, 1989, 262, pp. 2395-2401. “Copyrighted 1989, American Medical Association”


Table 4.2 Age-adjusted adult aerobic fitness standards for health promotion* (VO2max, ml/kg/min)

Age Group        Men     Women
45 and under     35      32
50               34      31
55               32      29
60               31      28
65 and over      30      27

* Standards developed from data (Jackson et al., 1995; Jackson et al., 1996) and personal communication with S. Blair, September 30, 1993. From Baumgartner and
Jackson (1999).


Body Density and Percent Body Fat — In simple terms, body weight consists of fat weight and
fat-free weight. Percent body fat is simply the proportion of total weight that is fat weight. Percent
body fat is calculated from body density, the ratio of body weight to body volume. The hydrostatic,
or underwater weighing, method is the most common laboratory method used to measure body
composition. Numerous laboratories at universities and medical centers have the equipment for
underwater weighing determinations. The measurement objective of hydrostatic weighing is to
measure body volume, which is then used with body weight to calculate body density. Percent fat
is calculated from body density. A newer, less common method for measuring body volume uses
a “body box” or body plethysmograph.

The values needed to calculate body volume (BV) are body weight on land (Wt), body weight
in water (Ww), the density of water (Dw), and the body's air component (Ba), consisting of residual
volume + 100 ml. The 100 ml value is an estimate of air in the gastrointestinal tract. Equation
11 gives the equation used to measure body volume. Body density (BD) is the ratio of body weight
on land to body volume (Equation 12).


Body Volume

$$BV = \frac{Wt - Ww}{Dw} - Ba \tag{11}$$

Body Density

$$BD = \frac{Wt}{BV} \tag{12}$$


Variation in body density can be caused by air, fat weight, and fat-free weight. The density of
air is effectively zero, and the density of fat tissue is about 0.90 g/cc. The density of the components
of fat-free weight varies from about 1.0 g/cc to as high as 3.0 g/cc. Fat-free weight consists of muscle,
blood, bone, and organs. The two-component models for computing percent body fat from body density
were based on the assumption that the density of fat tissue was 0.90 g/cc and that of fat-free weight
was 1.10 g/cc. Researchers are starting to question this assumption when computing the percent body
fat of children, the elderly, and ethnic groups. This has led to the development of multicomponent
models (Heymsfield, 1996; Lohman, 1992).

The first two-component equations developed for converting body density to percent body fat
were published by Siri (1961) and Brozek et al. (Brozek, Grande, & Anderson, 1963). The
equations provide nearly identical percent fat values throughout the human range of body fatness. The
equations are as follows:

Siri Percent Body Fat (13)

$$\%fat = \frac{495}{BD} - 450$$

Brozek Percent Body Fat (14)

$$\%fat = \frac{457}{BD} - 414$$
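The claim that the two equations give nearly identical values can be checked numerically. The sketch below evaluates Equations 13 and 14 over a few illustrative body densities; the function names and the chosen densities are assumptions made for the example.

```python
def siri_percent_fat(bd):
    """Equation 13 (Siri): percent fat from body density in g/cc."""
    return 495.0 / bd - 450.0


def brozek_percent_fat(bd):
    """Equation 14 (Brozek): percent fat from body density in g/cc."""
    return 457.0 / bd - 414.0


# Illustrative densities spanning a fairly fat to a fairly lean adult.
for bd in (1.03, 1.05, 1.07, 1.09):
    s = siri_percent_fat(bd)
    b = brozek_percent_fat(bd)
    print(f"BD={bd:.2f}  Siri={s:5.1f}%  Brozek={b:5.1f}%  diff={s - b:+.1f}")
```

Over this range the two estimates differ by roughly one percentage point or less, and the sign of the difference flips between the fat and lean ends.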

There is growing evidence that the Siri and Brozek equations may not be accurate when applied
to some ethnic groups. The ethnic differences in body density are believed to result from fat-free
weight differences associated with bone mineral content (Sinning, 1996; VanLoan, 1996).
Cross-sectional data (Vickery et al., 1988) showed that the mean body density of African-American men
(1.075 g/cc) was significantly higher than that of white men (1.065 g/cc). They also found that the mean
of the sum of seven skinfolds was not different, suggesting that the difference in the relationship
of skinfolds to body density for white men versus African-American men was due to variability in
the composition of fat-free weight.

Sinning (1996) suggests that the high bone mineral content of African-Americans results in a
fat-free density higher than the value of 1.1 g/cc assumed by the Brozek and Siri equations and
recommends that an equation published by Schutte et al. (Schutte, 1984) be used to convert body
density to percent fat for African-American men. The equation is as follows:

Schutte Equation for African-Americans (15)

$$\%fat = \frac{437.4}{BD} - 392.8$$

The Siri, Brozek, and Schutte methods of estimating percent body fat from body density have
lost their status as the “gold standard” for assessing body composition. Each is based on the
two-component model that assumes that the density of fat tissue is 0.9 g/cc and the body’s average
density of fat-free weight is 1.10 g/cc. Although this is likely true for adults between the ages of about
20 and 50 years, the two-component model has serious limitations when measuring the body
composition of the elderly, children, and ethnic groups (Lohman, 1992). The total body water and
mineral content of these groups differ from the values of the 20- to 50-year-old subjects,
and this affects the density of fat-free weight.

During childhood and the elderly years, the body is changing more dramatically. Changes in
body water and bone mineral content alter the density of the fat-free component. As these values
increase over reference values, there is a linear increase in percent body fat errors obtained with the
two-component method. Lohman (1992) provides an excellent discussion on the effect of bone
mineral differences on the accuracy of percent body fat determinations. The multicomponent
model is a method of minimizing these errors. The multicomponent model includes not only body
density but also water (w) and mineral (m) content. A multicomponent equation (Lohman, 1992)
that can be used for children or adults of any age and any ethnicity is as follows:

Multicomponent Fat Model (16)

$$\%Fat = \frac{2.747}{BD} - (0.727 \times w) - (1.146 \times m) - 2.053$$

As the percent body fat equation shows, body density is a primary element for measuring
percent body fat with either the two-component or multicomponent models. The accurate
measurement of percent fat depends on the accurate measurement of body density, and the underwater
weighing method is the “gold standard.” Although many believe that measuring underwater weight
is the biggest source of inaccuracy, this is not the case. The air component (i.e., residual lung volume
and gastrointestinal air) is the most error-prone variable. Table 4.3 shows how realistic measurement
errors in the variables used for underwater weighing affect the resulting percent
body fat, giving the percent body fat that would be obtained under three different error
conditions. Typically, the measurement error of underwater weight is less than 0.1 kg. Body weight and
water temperature can be measured very accurately. Residual lung volume can be difficult to
measure, and the air in the gastrointestinal tract is estimated at 100 ml (0.1 L). Air component errors of
±0.1 L translate to percent body fat errors of ±0.7% fat, but air component errors of ±1 L lead to huge
percent body fat errors, ±8.0% fat. Estimating residual volume from age, height, and gender yields
air component errors of this magnitude (Morrow, Jackson, Bradley, & Hartung, 1986). If the
underwater weighing method is to be used, the air component must be measured accurately.
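The sensitivity to the air component follows directly from Equations 11 to 13, and a short sketch makes the arithmetic visible. The subject values below come from the "Actual Value" column of Table 4.3; the water density of 0.9937 kg/L is an assumption, so the baseline will not exactly reproduce the table's 15.0%, and the function name is hypothetical.

```python
def siri_from_weighing(wt_kg, ww_kg, dw, ba_l):
    """Percent fat via Equations 11, 12, and 13 (Siri)."""
    bv = (wt_kg - ww_kg) / dw - ba_l   # body volume, L
    bd = wt_kg / bv                    # body density, g/cc
    return 495.0 / bd - 450.0


WT, WW, DW, BA = 70.0, 3.36, 0.9937, 1.20  # DW is an assumed water density

baseline = siri_from_weighing(WT, WW, DW, BA)
for air_error in (0.1, 0.4, 1.0):  # liters added to the true air component
    shifted = siri_from_weighing(WT, WW, DW, BA + air_error)
    print(f"air error +{air_error:.1f} L -> change of {shifted - baseline:+.2f}% fat")
```

For a 70 kg subject the change works out to about 0.7 percentage points per 0.1 L of air component error, because percent fat under the Siri equation is linear in body volume.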

Table 4.3 Effects of component errors on percent body fat determined by underwater weighing (Siri equation)*

                                             Error Conditions
Variable                    Actual Value     1          2          3
Air Component (L)           1.20             1.30       1.60       2.20
  % Fat                     15.00%           14.30%     12.20%     8.00%
Underwater Weight (kg)      3.36             3.38       3.41       3.46
  % Fat                     15.00%           14.90%     14.60%     14.30%
Body Weight (kg)            70.00            70.10      70.50      71.00
  % Fat                     15.00%           15.10%     15.30%     15.50%
Water Temp (°C)             36.00            36.10      36.50      37.00
  % Fat                     15.00%           15.10%     15.10%     15.20%

* Constructed from published data of Going 1996; Pollock and Wilmore 1984. From Baumgartner and Jackson (1999).


Height and Weight Assessment of Body Composition — Because of the need for highly trained
technicians and expensive laboratory equipment, hydrostatically determined body composition is
rarely used in field settings. A common method is to use a weight-height ratio. Body mass index
(BMI) is the weight-height ratio often used for large-scale public health studies. BMI (Equation
17) is computed from weight in kilograms and height in meters.


Body Mass Index (BMI) (17)

$$BMI = \frac{Weight}{Height \times Height}$$


The BMI standards used to define overweight for the Healthy People 2000 public health program
are 27.8 for men and 27.2 for women (USDHHS, 1990). The World Health Organization (WHO)
uses BMI to evaluate degree of obesity (Bouchard & Blair, 1999). The WHO standards are as follows:

* BMI 25.0 to 29.9 — Overweight (pre-obesity)

* BMI 30.0 to 34.9 — Obesity Class I

* BMI 35.0 to 39.9 — Obesity Class II

* BMI ≥ 40.0 — Obesity Class III
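Equation 17 and the WHO cutoffs listed above translate into a simple lookup. The sketch below is illustrative only; the function names and the label used for values under 25.0 are assumptions, since the list above does not name the lower categories.

```python
def bmi(weight_kg, height_m):
    """Equation 17: weight in kilograms divided by height in meters squared."""
    return weight_kg / (height_m * height_m)


def who_category(bmi_value):
    """Map a BMI value onto the WHO classes listed in the text."""
    if bmi_value >= 40.0:
        return "Obesity Class III"
    if bmi_value >= 35.0:
        return "Obesity Class II"
    if bmi_value >= 30.0:
        return "Obesity Class I"
    if bmi_value >= 25.0:
        return "Overweight (pre-obesity)"
    return "Below 25.0 (not classified as overweight)"  # hypothetical label


value = bmi(80.0, 1.75)  # hypothetical subject
print(f"BMI = {value:.1f}: {who_category(value)}")
```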

Another method of interpreting BMI is by its relationship with hydrostatically measured
percent body fat. Medical researchers (Gallagher, 1996) have published a generalized equation for
estimating percent body fat from BMI. They studied more than 700 African-American and white men
and women who ranged in age from 20 to 94 years. The multicomponent model was used to
measure percent body fat. They found that age, BMI, and gender (Female = 0, Male = 1) were
significantly related to percent fat, but ethnicity was not. The equation for estimating percent body fat
from these variables is as follows:

Gallagher et al. Equation (R = 0.819, SEE = 5.68 %fat) (18)

$$\%fat = (1.45 \times BMI) + (0.12 \times Age) - (11.61 \times Gender) - 10.02$$
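As a hedged illustration of Equation 18, the sketch below applies the coefficients exactly as printed above to two hypothetical subjects who differ only in gender. The function name and the subject values are assumptions for the example.

```python
def gallagher_percent_fat(bmi_value, age_years, gender):
    """Equation 18 as printed in the text; gender is 0 for female, 1 for male."""
    return 1.45 * bmi_value + 0.12 * age_years - 11.61 * gender - 10.02


# Two hypothetical 40-year-olds with the same BMI of 25.
for label, gender in (("female", 0), ("male", 1)):
    pf = gallagher_percent_fat(25.0, 40, gender)
    print(f"{label}: {pf:.1f}% fat")
```

The gender term alone separates the two estimates by 11.61 percentage points. With a standard error of estimate of 5.68% fat, individual predictions carry substantial uncertainty.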

Skinfold Equations — Several researchers have published regression equations to
predict hydrostatically measured body density from various combinations of skinfold
measurements. Early researchers developed equations for homogeneous populations. These were termed
“population-specific” equations. More than 100 population-specific equations appear in the
literature. The second trend was to use what are termed “generalized equations,” which are equations that
can be validly used with heterogeneous samples. Population-specific equations were developed on
small, homogeneous samples, and their application is limited to that sample.

The generalized equations were developed on large heterogeneous samples using models that
accounted for the nonlinear relationship between skinfold fat and body density. Age was an
important variable for generalized equations (Durnin & Rahaman, 1967; Jackson & Pollock, 1978;
Jackson et al., 1980). The main advantage of the generalized approach is that one equation replaces
several without a loss in prediction accuracy. A detailed discussion of population-specific and
generalized equations can be found in other sources (Cureton, 1984; Jackson, 1984; Lohman, 1982).

A factor analysis (Jackson & Pollock, 1976) of skinfold and body circumference variables
showed that skinfolds measured the same general body composition construct. This suggested that
the sum of several skinfolds provided the most accurate estimate of the body fat construct. It was
discovered that the sum of three skinfolds was highly correlated (r > 0.97) with
the sum of seven skinfolds (Jackson & Pollock, 1978; Jackson et al., 1980). This showed that the
sum of three skinfolds could be used without a loss of accuracy, and the sum-of-three equations
have become standard. Multiple regression models were used to develop generalized skinfold
equations for men (Jackson & Pollock, 1978) and women (Jackson et al., 1980) using the sum of seven
and three skinfold sites. The Jackson-Pollock databases were also used to develop the prediction
equations used for the YMCA adult fitness test (Golding et al., 1989). Equations 19 and 20 show
the Jackson-Pollock sum of three skinfold equations:

Females: Triceps, Suprailium, and Thigh (R = 0.84, SEE = 0.009) (19)

$$BD = 1.099421 - (0.0009928 \times S_3) + (0.000000023 \times S_3^2) - (0.0001382 \times Age)$$

Males: Chest, Abdomen, and Thigh (R = 0.91, SEE = 0.008) (20)

$$BD = 1.10938 - (0.0008267 \times S_3) + (0.0000016 \times S_3^2) - (0.0002574 \times Age)$$

where $S_3$ is the sum of the three skinfolds.
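A minimal sketch of how Equation 20 is typically chained with a two-component equation is given below. It uses the male coefficients as printed above and the Siri conversion (Equation 13). The subject values, the function names, and the assumption that skinfolds are entered in millimeters are illustrative and not stated in the text.

```python
def jp_male_body_density(sum3_skinfolds, age_years):
    """Equation 20: chest + abdomen + thigh skinfold sum (assumed mm) and age."""
    return (1.10938
            - 0.0008267 * sum3_skinfolds
            + 0.0000016 * sum3_skinfolds ** 2
            - 0.0002574 * age_years)


def siri_percent_fat(bd):
    """Equation 13: two-component conversion from body density."""
    return 495.0 / bd - 450.0


# Hypothetical 30-year-old man with a skinfold sum of 60.
bd = jp_male_body_density(60.0, 30)
print(f"Estimated body density: {bd:.4f} g/cc")
print(f"Estimated percent fat (Siri): {siri_percent_fat(bd):.1f}%")
```

The quadratic term is what lets one equation follow the curved skinfold-to-density relationship across lean and fat subjects, while the age term shifts the estimate for older subjects with the same skinfold sum.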

These equations have been used extensively to estimate the body composition of men and
women. The equations provide accurate models for estimating body density. The limitation of the
generalized equations is for estimating percent body fat. Using the two-component percent fat
equations of Siri, Brozek, and Schutte will only provide accurate percent body fat estimates when
the subject’s fat-free weight averages a density of 1.10 g/cc. This limits their use to subjects between
the ages of 20 and about 50 years.

Body Circumferences Prediction Models — Body circumferences have also been used to assess body
composition. The research method used was to estimate body density from combinations of body
circumference measurements. Body circumferences are correlated with hydrostatically determined
body density. Tran and associates (Tran & Weltman, 1989; Tran, Weltman, & Seip, 1988) published
generalized equations for estimating hydrostatically determined body density from various
combinations of circumference measurements. In 1981, the U.S. Navy changed from using height and
weight standards to percent body fat estimated from body circumferences (Hodgdon & Beckett,
1984a; Hodgdon & Beckett, 1984b). The variables used for the U.S. Navy equations are height,
abdomen circumference, hip circumference, and neck circumference. The U.S. Army also uses body
circumference to estimate percent body fat. Table 4.4 gives the military body composition equations.


Table 4.4 U.S. Military body composition equations

Army (Vogel et al., 1988)

  Men (R = 0.82, SEE = 4.02)
    Percent fat = 76.5 × Log10(abdomen II* − neck) − 68.7 × Log10(height) + 46.9

  Women (R = 0.82, SEE = 3.60)
    Percent fat = 105.3 × Log10(weight) − 0.200 × wrist − 0.533 × neck −
                  1.574 × forearm + 0.173 × hip − 0.515 × height − 35.6

Navy (Hodgdon and Beckett, 1984a, b) and Air Force

  Men (R = 0.90, SEE = 3.52)
    Density = −0.191 × Log10(abdomen II − neck) + 0.155 × Log10(height) + 1.032
    Percent fat = 100 × [(4.95 / Density) − 4.5]

  Women (R = 0.85, SEE = 3.72)
    Density = −0.350 × Log10(abdomen I† − neck) + 0.155 × Log10(height) + 1.032
    Percent fat = 100 × [(4.95 / Density) − 4.5]

Marine Corps (Wright et al., 1980, 1981)

  Men (R = 0.81, SEE = 3.67)
    Percent fat = 0.740 × abdomen II − 1.249 × neck + 40.985

  Women (R = 0.73, SEE = 4.11)
    Percent fat = 1.051 × biceps − 1.522 × forearm − 0.879 × neck +
                  0.326 × abdomen II + 0.597 × thigh + 0.707


NOTE: Circumference measurements and height are in centimeters. SEE, standard error of the estimate.

* Abdomen II is the circumference, measured in the transverse plane, at the level of the umbilicus.
† Abdomen I is the "natural waist" and is defined as the smallest circumference, measured in the transverse plane, obtained between the lower margin of the xiphoid
process and the umbilicus.

SOURCE: Adapted from Hodgdon (1992)
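The Navy equation for men in Table 4.4 can be sketched as a two-step calculation: circumferences to density, then density to percent fat. The conversion step is assumed to be the Siri form, 100 × (4.95 / Density − 4.5), since the printed table is partly illegible at that point. The function names and subject measurements are hypothetical.

```python
import math


def navy_male_density(abdomen_cm, neck_cm, height_cm):
    """Navy men's density equation from Table 4.4 (abdomen II at the umbilicus)."""
    return (-0.191 * math.log10(abdomen_cm - neck_cm)
            + 0.155 * math.log10(height_cm)
            + 1.032)


def percent_fat_from_density(density):
    """Assumed Siri-form conversion: 100 * (4.95 / density - 4.5)."""
    return 100.0 * (4.95 / density - 4.5)


# Hypothetical man: 85 cm abdomen, 38 cm neck, 178 cm tall.
density = navy_male_density(85.0, 38.0, 178.0)
print(f"Density: {density:.4f} g/cc")
print(f"Percent fat: {percent_fat_from_density(density):.1f}%")
```

Because the abdomen and neck enter as a difference inside the logarithm, a larger neck for the same abdomen lowers the estimated fat, and taller subjects receive a higher estimated density.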


Waist-Hip Ratio — Medical research has shown that people with central, visceral types of obesity are
particularly at risk for developing cardiovascular disease, stroke, and noninsulin-dependent diabetes
mellitus. The field test used to measure central visceral obesity is waist-hip ratio (WHR) (Equation
21). Efforts are now underway to obtain better estimates of central visceral fat with imaging
methods such as computed axial tomography (CT scans) (Wilmore et al., 1999). The development of
central, visceral obesity is believed to be caused by an alteration in the body's metabolic system.
Several of these endocrine abnormalities are associated with insulin resistance that is believed to be
the cause of the increased disease risk. Although the BMI has been used to identify overweight
individuals, Bray (Bray, 1993) proposed that both BMI and WHR be used to define health risk.
Waist-Hip Ratio (WHR) (21)

$$WHR = \frac{\text{Waist-C}}{\text{Hip-C}}$$
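Equation 21 is a plain ratio of two circumferences, sketched below together with BMI because Bray proposed using both. The function names and subject values are hypothetical, and no risk thresholds are applied because the text does not give any.

```python
def waist_hip_ratio(waist_cm, hip_cm):
    """Equation 21: waist circumference divided by hip circumference."""
    return waist_cm / hip_cm


def bmi(weight_kg, height_m):
    """Equation 17: weight in kilograms over height in meters squared."""
    return weight_kg / (height_m * height_m)


# Hypothetical subject; both indices are reported side by side.
whr = waist_hip_ratio(85.0, 100.0)
index = bmi(80.0, 1.75)
print(f"WHR = {whr:.2f}, BMI = {index:.1f}")
```

Both circumferences must be in the same unit, which cancels in the ratio.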


Dual Energy X-Ray Absorptiometry — Dual energy x-ray absorptiometry (DXA) is a method that
evolved from the widespread use of single- and dual-photon absorptiometry. The development of
computer technology enhanced the application of DXA technology to the measurement of body
composition. Lohman (Lohman, 1996) reports that DXA can be used to measure total body and
regional body composition, including the estimation of bone mineral content, lean tissue mass,
fat-free mass, and fat mass. Many believe that DXA may become a reference method of estimating
human body composition and even replace underwater weighing. The DXA standard errors
typically range from 2.5 to 3.5 percent when estimating percent fat from body density using the
multicomponent method (Lohman, 1996). With improved computer technology, test methods, and
lower costs, DXA will likely grow in popularity. A major limitation of the hydrostatic method is
that it can be difficult and even impossible to underwater-weigh persons who have a fear of water.

Bioelectrical Impedance Method — Bioelectrical impedance analysis (BIA) is based on the
principle that the electrical resistance of the body to a mild electric current is related to total body water.
Total body water and fat-free weight are highly related. The BIA method is simple and requires
only the placement of four electrodes, two on the subject’s ankle and two on the wrist. An
electrical current is transmitted into the subject, and the resistance in ohms is read directly into a
microcomputer that calculates body composition.

In the early stages of BIA technology, the accuracy of BIA was a major concern. One study
showed that this method was no more accurate than BMI (Jackson, Pollock, Graves, & Mahar,
1988). Recent research (Lohman, 1992) showed that with suitable equations, BIA estimates of
percent body fat are similar in accuracy to skinfold estimates, except for the obese and very lean.
Equations developed on the general population tend to underestimate percent body fat of the obese
and overestimate the percent body fat of very lean subjects, showing that more research is needed
to develop generalized BIA equations.

Percent Body Fat Standards — Different percent body fat standards are needed for men and
women. Women have a higher percentage of their weight not only in storage fat measured with the
caliper but also in essential fat, consisting of lipids of the bone marrow, central nervous system,
mammary glands, and other organs. Because of this additional fat, the percent body fat of
women tends to be about 7 to 8 percent higher than that of men (Jackson & Pollock, 1978;
Jackson et al., 1980). Tables 4.5 and 4.6 give the percent body fat standards recommended by the
American College of Sports Medicine (Gettman, 1993).

Table 4.5 Percent body fat standards for women contrasted by age group*

                                Age Group (Women)
Standard            <30        30-39      40-49      >50
High                >32%       >33%       >34%       >35%
Moderately High     26-32      27-33      28-34      29-35
Optimal Range       15-25      16-26      17-27      18-28
Low                 12-14      13-15      14-16      15-17
Very Low            ≤11%       ≤12%       ≤13%       ≤14%

* Reprinted, by permission, from L. R. Gettman, 1993, American College of Sports Medicine: Resource manual for guidelines for exercise testing and prescription (2nd ed.,
pp. 229-246). Philadelphia: Lea & Febiger.


Table 4.6 Percent body fat standards for men contrasted by age group*

                                Age Group (Men)
Standard            <30        30-39      40-49      >50
High                >28%       >29%       >30%       >31%
Moderately High     22-28      23-29      24-30      25-31
Optimal Range       11-21      12-22      13-23      14-24
Low                 6-10       7-11       8-12       9-13
Very Low            ≤5%        ≤6%        ≤7%        ≤8%


Muscular Strength

Muscular strength is the maximum amount of force that a muscle group can exert. Muscle
contractions can be either dynamic or static. Static contractions do not involve movement and are
called isometric. Dynamic contractions involve movement and are either concentric (the muscle shortens)
or eccentric (the muscle lengthens). The dynamic forms include isotonic and isokinetic. Isotonic
involves moving a weight against gravity. Lifting the weight uses a concentric contraction, while
lowering the weight uses an eccentric contraction. Isokinetic involves muscle contractions at a fixed
speed. Strength testing may be either open or closed kinetic chain. An open kinetic chain is when
the end of the limb segment is free in space, while a closed kinetic chain is when the end segment
or joint meets external resistance that prevents or restrains free motion. In a closed kinetic
chain, movement at one joint produces movement at the other joints in the chain or system. In an
open kinetic setting, the distal limb segment can move freely (Baumgartner & Jackson, 1999). Table
4.7 provides an overview of the strengths and weaknesses of the strength testing methods
(Baumgartner & Jackson, 1999).

Types of Strength Tests — Isometric strength testing has historically been popular. Isometric strength is the
maximum force that a muscle group can exert without movement. Mechanical devices such as
tensiometers and spring dynamometers were the units first used to measure the force applied during
an isometric contraction. These mechanical units have been replaced with electronic load cells.
Professional standards for equipment and isometric test methods are published (Chaffin, 1975;
NIOSH, 1977). Isometric tests can measure specific muscle groups or combinations of muscle
groups. Isometric arm, shoulder, torso, and leg strength have been used extensively for
preemployment decisions (Baumgartner & Jackson, 1999; Jackson, 1994; NIOSH, 1977).
Table 4.7 A comparison of strength testing methods*

Isometric
  Strengths
    1. Moderately inexpensive
    2. Can be used to test a variety of different muscle groups
    3. Closed kinetic chain
    4. Strong research base for preemployment testing
    5. Normative data are available
    6. Takes very little time to test a subject
    7. Easy to learn how to administer the tests
  Weaknesses
    1. Only one joint angle is tested
    2. Does not provide a torque strength curve
    3. Cannot measure dynamic contractions

Isotonic
  Strengths
    1. Very inexpensive
    2. Many different types of equipment can be used
    3. Closed kinetic chain
    4. Tests often duplicate strength development program
    5. Takes very little time to test a subject
    6. Easy to learn how to administer the tests
  Weaknesses
    1. Never measure "true maximum"
    2. Cannot obtain a strength curve
    3. Risk of injury if free weights are used
    4. Different types of equipment affect the score; need equipment-specific norms
    5. Can be difficult to find 1-RM

Isokinetic
  Strengths
    1. Can obtain strength curves for many different speeds
    2. Can obtain both eccentric and concentric contractions
    3. Data can be expressed in many different ways
    4. Valuable for rehabilitation process
  Weaknesses
    1. Very expensive equipment
    2. Not closed kinetic chain
    3. Velocity of movement affects torque output; need norms for various speeds

* From Baumgartner and Jackson (1999)


Isotonic strength is measured by determining the maximal force that a muscle group can exert
with a single contraction. An isotonic strength test measures the maximum weight that can be
lifted with a single repetition. This is the one-repetition maximum test (1-RM). The equipment used
to measure 1-RM strength includes free weights and progressive resistance equipment. The most
difficult part of the test is to find the subject's maximal load. Several different weights will need to
be tried to find the proper 1-RM weight. Bench and leg press 1-RM strength are common
isotonic tests (Gettman, 1993).

Isokinetic methods measure torque through a defined range of motion while keeping the speed
of movement constant. The equipment used to measure isokinetic strength is a load cell interfaced
with a computer. The computer unit controls the speed of movement and measures torque. This yields
the muscle group's torque curve for the selected constant velocity. Both muscle strength and the
velocity of movement affect the shape and magnitude of the curve. As the muscle contracts at a faster rate,
it cannot generate as much torque, so a lower curve is obtained. Test results from different test centers
are not comparable unless the sites used the same equipment and the same test velocity.

Isokinetic  equipment  is  expensive  and  so  is  usually  used  only  at  well-equipped  testing  centers 
such  as  sports  medicine  and  physical  therapy  facilities.  Over  the  past  few  years  isokinetic  testing  has 
lost  favor.  There  are  several  reasons  for  this.  Major  factors  are  the  cost  of  the  equipment  and  changes 
in  our  health  care  systems.  Managed  health  care  corporations  have  dramatically  reduced  the  money 
they will pay for strength evaluations. Another reason is that isokinetic tests are largely open-kinetic
chain, and the current rehabilitation philosophy is to use closed-kinetic chain. Isokinetic tests are
typically  used  to  measure  isolated  muscle  groups  like  knee  extension  and  flexion. 


Correlations Among Types of Strength Tests — Although strength tests involve dynamic and
static contractions, they are highly correlated. In a controlled laboratory study, Laughlin (1998)
examined the relationship between closed-kinetic isometric and dynamic 1-RM leg strength. A sample
of 57 healthy female athletes was administered isometric and isotonic leg strength tests. A Cybex
leg press machine was used to test the athletes' maximum isotonic leg strength. Both the dominant
and nondominant legs were tested. The Cybex unit was also used to measure dominant and
nondominant isometric strength at 30°, 60°, and 90° knee flexion. Table 4.8 gives the results of the
study. The correlations between the isometric and isotonic strength tests were very high. The
coefficients ranged from 0.91 to 0.94. These high correlations were expected because the test positions
were duplicated, thereby increasing the probability that the isometric and isotonic tests measured
the strength of the same muscle groups. These data showed that the type of muscle contraction,
static or dynamic, did not affect the measurement of strength.


Table 4.8 Pearson product-moment correlations between static and dynamic leg strength

                                 Isotonic Test
Isometric Test          Dominant Leg     Non-Dominant Leg
90° Knee Flexion        .94              .91
60° Knee Flexion        .93              .91
30° Knee Flexion        .93              .94

Muscular  Endurance 

Muscular endurance is the ability to persist in physical activity or to resist muscular fatigue.
Endurance tests can measure absolute endurance, where the power output is the same for all
subjects tested, or relative endurance, where the power output varies among the subjects tested.
Absolute endurance tests tend to be correlated with strength, while the correlation between relative
endurance and strength tests tends to be close to zero (Baumgartner & Jackson, 1999; deVries &
Housh, 1994; Jackson, Osburn, & Laughery, 1984; Jackson, Osburn, Laughery, & Vaubel, 1992;
Jackson, Osburn, & Laughery, 1991a; Jackson, Osburn, Laughery Sr., & Vaubel, 1991b). There are
two general types of endurance tests: laboratory tests that use an ergometer (cycle or arm) to
standardize power output, and field tests such as push-ups, pull-ups, or sit-ups that are common items
of fitness test batteries.


Table 4.9 Factor analysis of strength data published by Dempsey et al., 1998

Test                      Type of Strength     Factor Loading
Isometric 15*             Static               [illegible]
Isometric 75*             Static               .83
Isokinetic 0.1**          Dynamic              .92
Isokinetic 0.2**          Dynamic              .90
Isokinetic 0.4**          Dynamic              .93
Isokinetic 0.6**          Dynamic              .89
Isokinetic 0.8**          Dynamic              .90
Incrementally Lifting     Dynamic              .70
Power                     Dynamic              .87
MAWL                      Psychophysical       .83

* Height of [illegible] position.
** Speed of movement.

Ergometer Tests — An all-out cycling endurance test first described in 1973 was called the Katch
test (McArdle, Katch, & Katch, 1991). This test, refined at the Department of Research and Sport
Medicine at the Wingate (Israel) Institute, is now known as the Wingate anaerobic power test
(Bar-Or, 1987). This has become the test of choice for measuring anaerobic endurance. The
Wingate power test involves cycling as fast as possible for 30 seconds at a set resistance, which is a
proportion of the subject's body weight (Bar-Or, 1987; Baumgartner & Jackson, 1999). Arm
endurance can be measured in the same way with an arm ergometer. Arm ergometer tests have been
used in employment settings (Laughery & Jackson, 1985).

Field Tests — Push-ups, pull-ups, flexed-arm hang, and sit-ups are common muscular endurance
field tests. These tests require the subject to move or support their body weight against
the pull of gravity. This may involve either isometric or isotonic contractions. The tests are either
performed to exhaustion (e.g., number of pull-ups completed) or for a specified duration of time
(e.g., number of sit-ups completed in two minutes). There is a negative correlation between body
weight and this basic physical ability. The correlation tends to be higher between percent body
fat and this ability (Baumgartner & Jackson, 1999).

Flexibility 

Flexibility is the range of movement about a joint. Individual differences in flexibility depend
on physiological characteristics that influence the extensibility of the muscles and ligaments
surrounding a joint. Although it is agreed that certain levels and types of flexibility are desirable, the
degree of flexibility desired is yet to be determined. Typically, a trunk flexibility test is included as an
item in fitness test batteries (Baumgartner & Jackson, 1999; Golding et al., 1989).

Flexibility is often regarded as a single general factor or ability. Harris (Harris, 1969) conducted
a factor analysis study to determine whether flexibility is a single or general factor. Two types of
flexibility tests were used in this study:

1.  tests  that  measure  the  movement  of  a  limb  involving  only  one  joint  action,  and 

2.  composite measures of movements that require more than one joint or more than one type
of  action  within  a  single  joint. 

The analysis revealed many intercorrelations to be near zero, which implies specificity instead
of generality. A factor analysis of these data revealed 13 different factors of flexibility. Harris
concluded, then, that there is no evidence that flexibility is a single general factor. These data suggest
that flexibility tests should duplicate the specific type of flexibility desired.


Work  Sample  Tests 


This section reviews the advantages and disadvantages of work sample tests. A work sample test
is designed to duplicate or simulate a critical work task or a series of important work tasks. The
advantage of a work sample test is that it simulates the actual working conditions. Although
criterion-related and construct validation methods can be used to examine the validity of a work sample
test, the primary method used is content validation. Work sample tests are commonly used to screen
applicants for police officer and firefighter jobs. Arvey (Arvey, Nutting, & Landon, 1992) reported
that most police and firefighter physical ability tests consist of some combination of job sample tests.
Examples of common firefighter work sample test items (Davis et al., 1992) are as follows:

*  Stair Climb: Carry a 58-pound hose bundle up 5 flights of stairs.

*  Hoseline Drag: Drag a 1.75-inch charged hose 100 feet.

*  Rescue Dummy Drag: Lift and drag a 175-pound dummy 100 feet.

*  Smoke Extractor Carry: Lift and transport a 47.5-pound fan a distance of 150 feet.

*  Kieser Force Machine: Repeatedly pound an object with a sledge hammer until it is moved
a specified distance.

This  last  test  simulates  a  forced  entry  into  a  building. 

The  major  advantage  of  a  good  work  sample  test  is  that  it  has  the  potential  of  enhancing  its 
content  validity  by  duplicating  the  actual  work  task.  To  illustrate,  assume  that  a  job  requirement  is 
to  lift  a  75-pound  box  from  floor-to-knuckle  height.  An  example  of  a  content  valid  work  sample 
test  would  be  one  that  requires  the  subject  to  lift  a  75-pound  box  from  the  floor  and  place  it  on  a 
table. The test would be objectively scored as a pass or fail.

Although work sample tests have excellent content validity, they also have limitations. In some
instances, work sample tests are expensive to create and difficult to set up and transport from one
test location to another. For instance, a wooden structure was made to duplicate the physical
environment of the cargo space of an aircraft used to transport freight (Jackson, Osburn, Laughery,
& Young, 1993). Along with the test administration difficulties, Ayoub (Ayoub, 1982) maintains
that work sample tests have two major limitations. The first limitation is safety. Applicants seeking
employment are likely to be motivated to pass the work sample test. Highly motivated applicants
who lack the physical capacity to perform the test are likely to increase their risk of injury. Research
shows that the risk of injury increases as workers perform materials handling tasks that
approach their maximum capacity (Dehlin, Hendenrud, & Horal, 1976; Herrin et al., 1986;
Magora, 1970; Snook, Campanelli, & Hart, 1978). As another example, outdoor telephone craft
jobs require employees to climb telephone poles, and accident data showed that this was a
dangerous task (Reilly, Zedeck, & Tenopyr, 1979). Using a pole-climbing test to screen applicants would
have content validity, but would likely be dangerous for untrained employees.

A second limitation of work sample tests is that they do not give any information about the
applicant's maximum work capacity (Ayoub, 1982). A work sample test is often scored by pass or fail (e.g.,
lifted a 95-pound jackhammer and carried it a specified distance). Some applicants can easily
complete the test, while others may just pass and be working near their maximum. If it can be assumed
that there is a linear relationship between job performance and preemployment test performance,
applicants with the highest test scores can be expected to be the more productive workers. Testing for
maximum capacity not only identifies the most potentially productive workers but also provides the
opportunity to define a level of reserve needed to reduce the risk of musculoskeletal injury.
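The difference between a pass/fail score and a maximum-capacity score can be sketched with a few lines of Python. This is only an illustration under stated assumptions: the two maximum capacities and the 75 percent reserve limit are hypothetical values chosen for the example, not figures from the cited studies, and the function names are invented here.

```python
def percent_of_maximum(task_load_lb: float, max_capacity_lb: float) -> float:
    """Express a task load as a percentage of a person's maximum capacity."""
    if max_capacity_lb <= 0:
        raise ValueError("max_capacity_lb must be positive")
    return 100.0 * task_load_lb / max_capacity_lb


def has_reserve(task_load_lb: float, max_capacity_lb: float,
                reserve_limit_pct: float = 75.0) -> bool:
    """True when the task stays at or below the chosen percentage of maximum.

    The 75 percent default is a hypothetical limit used only for illustration.
    """
    return percent_of_maximum(task_load_lb, max_capacity_lb) <= reserve_limit_pct


JACKHAMMER_LB = 95.0  # load from the pass/fail example in the text

# Hypothetical maximum lift capacities for two applicants who both pass.
applicants = {"A": 100.0, "B": 190.0}

for name, max_lb in applicants.items():
    pct = percent_of_maximum(JACKHAMMER_LB, max_lb)
    print(f"Applicant {name}: {pct:.0f}% of maximum, "
          f"reserve ok: {has_reserve(JACKHAMMER_LB, max_lb)}")
```

Both hypothetical applicants would pass the work sample test, yet only the maximum-capacity score shows that one of them is working close to the limit.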

Another potential limitation of work sample tests is that there can be a factor of skill required by
the test. This can be illustrated with an example. The ability to use a sledge hammer is a work sample
test item used to select firefighters. Sledge hammering is a physically demanding task in which the
ability to deliver a forceful blow depends not only on strength but also on neuromotor coordination of the
stroke. If a person has never used a sledge hammer, he or she would not have yet developed a
coordinated stroke. Part of firefighter training is to develop the skill that is used to gain forced entry.


Relationship of Basic Abilities on Work Sample Tests

Although work sample tests can have the advantage of being content valid, they can be difficult
to set up and time consuming to administer. To overcome this limitation, researchers have
examined the relationship between the basic ability tests and the work sample tests. When the basic
ability tests and the work sample tests are highly correlated, the basic ability tests can replace the work
sample tests. Arnold and associates (Arnold, Rauschenberger, Soubel, & Guion, 1982) were among
the first to demonstrate that basic ability tests were highly correlated with work sample tests. They
developed work sample tests for steelworkers and administered them to 81 women and 168 men
who were in their first six months of employment at three different plant locations. In addition, the
workers' strength, flexibility, agility, balance, and cardiorespiratory endurance were tested. They
discovered that arm dynamometer strength was highly correlated with work sample test performance.
The zero-order correlations between arm strength and work sample test performance were
consistently high: 0.82, 0.85, and 0.85 for the three sites. Multiple regression analysis showed that
flexibility, agility, balance, and cardiorespiratory endurance did not account for an additional significant
proportion of work test variance. Although only strength was correlated with the steelworker work
sample tests, it is important to realize this may not be true for other work sample tests. There is
considerable evidence that VO2max, strength, and fat-free weight are significantly related to
work sample test performance. Provided next is a brief review of the role of VO2max, strength, and
fat-free mass in work sample test performance.


Role of VO2max

Work tasks such as climbing stairs and fighting fires have a significant aerobic endurance
component. Published physiological research documents the cardiovascular response to these work
tasks. A tradition of work physiology has been to define the energy cost of work tasks with oxygen
uptake (Durnin & Passmore, 1967; McArdle et al., 1991; Wilmore & Costill, 1994). This provides
a sound physiological basis for establishing a cut-score.

A current, important research focus is to define the energy cost needed to fight fires. This
research focus can be attributed to litigation leveled at the validity of firefighter preemployment
tests and the use of age to terminate employment. Several investigators (Barnard & Duncan, 1975;
Lemon & Hermiston, 1977a; Lemon & Hermiston, 1977b; Manning & Griggs, 1983; O'Connell,
Thomas, Caddy, & Karwasky, 1986; Sothmann, Saupe, Jasenor, & Blaney, 1992) published data
showing that fire-suppression work tasks have a substantial cardiovascular endurance component.
In an important study, Sothmann and a team of researchers (Sothmann et al., 1990) provide strong
evidence that the minimum VO2max required to meet the demands of fire fighting is 33.5
ml/kg/min. The authors used a work sample test involving seven job-related firefighter tasks. The
sensitivity (percentage of correctly classified unsuccessful performers) and specificity (percentage of
correctly classified successful performers) for a VO2max cut-score of 33.5 ml/kg/min were 67
percent and 83 percent, respectively. Lowering the cut-score to 30.5 ml/kg/min dropped the sensitivity
to 25 percent and increased the specificity to 95 percent.
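The trade-off between sensitivity and specificity at two cut-scores can be sketched as follows. The definitions follow the text (sensitivity is the percentage of unsuccessful performers who fall below the cut-score; specificity is the percentage of successful performers who meet it). The VO2max values are hypothetical and were made up for this illustration; they are not the Sothmann et al. data, and the function name is an assumption.

```python
from typing import List, Tuple

Record = Tuple[float, bool]  # (VO2max in ml/kg/min, passed work sample test)


def sensitivity_specificity(records: List[Record],
                            cut_score: float) -> Tuple[float, float]:
    """Return (sensitivity, specificity) in percent for a VO2max cut-score.

    Sensitivity: share of unsuccessful performers below the cut-score.
    Specificity: share of successful performers at or above the cut-score.
    """
    unsuccessful = [v for v, passed in records if not passed]
    successful = [v for v, passed in records if passed]
    if not unsuccessful or not successful:
        raise ValueError("need both successful and unsuccessful performers")
    sens = 100.0 * sum(v < cut_score for v in unsuccessful) / len(unsuccessful)
    spec = 100.0 * sum(v >= cut_score for v in successful) / len(successful)
    return sens, spec


# Hypothetical sample, invented for illustration only.
records: List[Record] = (
    [(v, False) for v in (28.0, 30.0, 31.5, 32.5, 33.0, 35.0)]
    + [(v, True) for v in (32.0, 34.0, 35.5, 37.0, 38.5, 40.0, 42.0, 45.0)]
)

for cut in (33.5, 30.5):
    sens, spec = sensitivity_specificity(records, cut)
    print(f"cut-score {cut}: sensitivity {sens:.1f}%, specificity {spec:.1f}%")
```

As in the study, lowering the cut-score in this toy sample screens out fewer unsuccessful performers while excluding fewer successful ones.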

Because of the expense involved, laboratory VO2max tests are rarely used to make
employment decisions. Rather, the 1.5-mile distance run tends to be used to evaluate aerobic capacity in
field settings. In a recent firefighter preemployment test (Jeanneret & Associates, 1999), the
correlation between 1.5-mile run time and elapsed time to complete a five-item firefighter work
sample test was 0.59 (n = 125), supporting the laboratory data showing that fire-fighting ability was a
function of aerobic power.

Role  of  Strength  and  Fat-Free  Weight 

Strength is the basic ability most often shown to be associated with physically demanding work
tasks (Hogan, 1991). Since the body composition component of fat-free weight consists mainly of
muscle mass, the body's force-producing component, the correlation between strength and fat-free
mass is high. It has been shown that the correlation between isometric strength and absolute
fat-free weight exceeds 0.80 and both are equally valid predictors of lifting capacity (Jackson, Borg,
Zhang, Laughery, & Chen, 1997).

Role of Strength on Work Sample Tests: Researchers from the University of Houston and Rice
University, Houston, Texas, completed a series of preemployment studies in which one step in the
research methodology was to examine the relationship between isometric strength and work
sample test performance. Although the complete research methods appear in technical reports, a
representative sample of this research has been published (Jackson et al., 1999; Jackson, Borg, Zhang,
Laughery, & Chen, 1996a; Jackson et al., 1997; Jackson et al., 1984; Jackson et al., 1992; Jackson
et al., 1991a). The results of this research are consistent with those reported by Arnold and
Associates (Arnold et al., 1982) showing that muscular strength was an important determinant of
performance on content-valid work sample tests.

Table  4.10  gives  the  product-moment  correlation  between  isometric  strength  tests  and  work 
sample  test  performance.  The  strength  tests  were  highly  correlated  with  several  different  types  of 
work  sample  tests.  The  highest  correlations  were  between  isometric  strength  tests  and  static  strength 
work  sample  tests  that  involved  the  capacity  to  generate  force  in  various  body  positions.  These  work 
sample  tests  measured  the  subject’s  capacity  to  generate  push,  pull,  and  lift  force  and  generate  valve 
cracking  force  while  in  common  positions  required  of  refinery  workers.  The  isometric  strength  tests 
were  also  correlated  with  work  sample  absolute  endurance  tests.  The  dynamic  tasks  were  either  per¬ 
formed  to  exhaustion  (e.g.,  valve  turning)  or  at  a  “comfortable  rate”  set  by  the  person  being  tested 
(e.g.,  shoveling  600  pounds  of  material  over  a  3.5-foot  wall).  These  dynamic  work  sample  tests 
involved  valve  turning,  shoveling,  and  repetitive  materials  handling  tasks.  The  correlations  between 
strength  and  the  absolute  endurance  work  tests  ranged  from  0.67  to  0.83.  Besides  these  work  tasks, 
strength has been shown to be related to lifting capacity (Jackson et al., 1999; Jackson et al., 1997)
and firefighter work sample test performance (Jeanneret & Associates, 1999). The correlations
between the elapsed time to complete the five-item firefighter work sample test and four isometric
strength tests ranged from -0.61 for leg and torso strength to -0.90 for arm strength.

Role of Body Composition on Work Sample Tests: Percent body fat and fat-free mass are the
body composition variables most often examined in relation to work sample test performance.
Percent body fat tends to be negatively correlated with work sample tests that require the subjects
to move their body mass. Since fat-free mass is the body's force-producing component, it tends to
correlate with work tasks related to strength. Skinfold estimated percent body fat was found to be
significantly related to the ability of telephone workers to climb poles (Bernauer & Bonanno, 1975;
Reilly et al., 1979). Triceps skinfold thickness was one of the tests used to screen applicants for
pole-climbing school. A limitation of using skinfold fat or percent body fat in employment
decisions in the public sector is that the chance of litigation increases. Women have about 5 to 10
percent more body fat than men, which is largely because of differences in essential fat (Lohman,
1992). Using a single skinfold site such as triceps also invites litigation because of male and female
differences in fat patterning (Jackson & Pollock, 1978; Jackson et al., 1980).

Hodgdon and Associates (Hodgdon, 1992) examined the relationship between body
composition, fitness, and materials handling tasks required of U.S. Navy enlisted men. The two work
sample materials handling tasks were the maximum weight of a box that could be lifted to elbow height,
and the total distance a 34-kilogram box could be carried during two 5-minute work bouts. The
variable most highly correlated with maximum box lift was fat-free mass, 0.84. The variables most
highly correlated with the box-carry test were push ups, 0.56; 1.5-mile run time, -0.67; and fat-free mass,
0.44. Fat-free mass was highly correlated with muscular strength measures and suggested the
possibility of using fat-free mass as an approximation of general strength in job assignment.


Table 4.10 Correlations between the Sum of Isometric Strength* and Simulated Work Sample Tests

Reference                    Work Sample Test         Type of Test                r

Jackson and Osburn 1983      One-arm Push Force       Isokinetic - Peak Torque    0.91
Jackson 1986                 Push Force               Static - Max Force          0.86
Jackson et al., 1993         Push Force               Static - Max Force          0.78
Jackson 1986                 Pull Force               Static - Max Force          0.78
Jackson et al., 1993         Pull Force               Static - Max Force          0.67
Laughery and Jackson 1984    Lifting Force            Static - Max Force          0.93
Jackson et al., 1998         Valve Cracking           Static - Max Force          0.91
Jackson et al., 1992         Valve-Turning            Dynamic - Endurance         0.83
Jackson et al., 1993         Box Transport            Dynamic - Endurance         0.76
Jackson et al., 1992         Moving Document Bags     Dynamic - Endurance         0.70
Jackson et al., 1991         Shoveling Coal           Dynamic - Endurance         0.71
Jackson et al., 1991         50-pound Bag Carry       Dynamic - Endurance         0.63
Jackson and Osburn 1983      70-pound Bag Carry       Dynamic - Endurance         0.87

From Baumgartner and Jackson (1999)

Vogel and Friedl (Vogel & Friedl, 1992) examined the relationship between body composition
and  absolute  lifting  capacity.  They  reported  significant  correlations  between  maximum  lifting 
capacity  and  fat-free  mass  for  male  and  female  soldiers.  The  slope  of  the  male  equation  was  2.2 
times  larger  than  the  slope  of  the  women’s  equation,  suggesting  the  lack  of  homogeneity  of  male 
and  female  regression  lines.  Other  research  (Jackson  et  al.,  1997)  also  showed  that  the  relationship 
between  fat-free  weight  and  lift  capacity  was  a  function  of  muscular  strength. 

Need  for  Further  Research 

Published data show that there tends to be a significant relationship between various basic
ability tests and work sample tests. These correlations range from low to high, depending on the tests
studied.  A  limitation  of  this  body  of  research  is  that  many  basic  ability  tests  tend  to  be  correlated 
with  each  other,  making  it  difficult  to  define  the  true  determinants  of  work  test  performance.  As 
one  example,  percent  body  fat  tends  to  be  significantly  correlated  with  tests  involving  movement  of 
the  body.  This  includes  not  only  running  tests  of  speed  and  aerobic  endurance,  but  also  arm  tests 
such as chin-ups and push ups that involve repeatedly moving the body mass with the arms.

A major problem that needs further study is the method used to scale VO2max. Maximal oxygen
uptake can be expressed in either absolute or relative terms. Absolute VO2max is the total volume of
oxygen the person can consume during exhausting work, and is expressed in a metric of milliliters of
oxygen per minute (ml/min) or liters of oxygen used per minute (L/min). In contrast, relative
VO2max is expressed in a metric normalized for body weight, per kilogram of body weight
(ml/kg/min). The metric used to express VO2max affects test validity. Table 4.11 illustrates the
problem and shows the differences when VO2max is expressed in absolute and relative terms. The
table provides the correlations between body composition variables and VO2max expressed in relative
and absolute terms. Data are provided for men and women from a very large database of NASA/JSC
(Houston, TX) employees (Jackson et al., 1995; Jackson et al., 1996b). VO2max was metabolically
measured following standard procedures for defining VO2max, and skinfold equations (Jackson &
Pollock, 1978; Jackson et al., 1980) were used to measure the body composition variables. These data
show that when VO2max is expressed in relative terms (ml/kg/min), body weight, percent body fat,
and fat weight are negatively correlated with aerobic fitness. In contrast, when expressed as absolute
VO2max (ml/min), fat-free weight is the variable most highly correlated with aerobic fitness.
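The two metrics are related by body weight, which a short Python sketch can make concrete. The function names and the example person (70 kg, 40 ml/kg/min) are assumptions introduced for illustration; only the unit definitions come from the text.

```python
def relative_to_absolute(relative_ml_kg_min: float, weight_kg: float) -> float:
    """Convert relative VO2max (ml/kg/min) to absolute VO2max (ml/min)."""
    if weight_kg <= 0:
        raise ValueError("weight_kg must be positive")
    return relative_ml_kg_min * weight_kg


def absolute_to_relative(absolute_ml_min: float, weight_kg: float) -> float:
    """Convert absolute VO2max (ml/min) to relative VO2max (ml/kg/min)."""
    if weight_kg <= 0:
        raise ValueError("weight_kg must be positive")
    return absolute_ml_min / weight_kg


# Hypothetical person: 70 kg body weight, relative VO2max of 40 ml/kg/min.
absolute = relative_to_absolute(40.0, 70.0)
print(f"absolute VO2max: {absolute:.0f} ml/min ({absolute / 1000.0:.1f} L/min)")
print(f"back to relative: {absolute_to_relative(absolute, 70.0):.1f} ml/kg/min")
```

Because body weight sits in the denominator of the relative metric, heavier or fatter people tend to score lower on it, which is consistent with the negative correlations described for the relative metric.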


Table 4.11 Product-moment correlations between body composition variables and VO2max expressed in
relative (ml/kg/min) and absolute (ml/min) terms

                      Males (n = 1,477)               Females (n = 409)
Variable              VO2max         VO2max           VO2max         VO2max
                      (ml/kg/min)    (ml/min)         (ml/kg/min)    (ml/min)

Weight                -0.30*          0.32*           -0.48*          0.14*
% Fat                 -0.62*         -0.24*           -0.48*          0.14*
Fat-Free Weight       -0.04           0.57            -0.05           0.51*
Fat Weight            -0.56*         -0.06            -0.67          -0.16*

* p < 0.01


An issue that needs further research concerns the VO2max metric that gives the highest validity
for work test performance. The typical method used to express aerobic fitness is in relative terms
(ml/kg/min). The data provided in Table 4.11 suggest that relative VO2max would be the most valid
when examining work tasks that involve repetitive bodily movement such as climbing towers,
walking, and running long distances. The high correlation between percent body fat and relative VO2max
should be expected because transporting high levels of fat makes body movement tasks more
demanding. It has been demonstrated that endurance work tasks of shoveling, repetitive material transports
(Jackson et al., 1991a) and valve turning (Jackson et al., 1991b) were correlated with strength. This
link can be traced to the correlation between fat-free weight and absolute VO2max. The correlation
between fat-free weight and strength is reported to exceed 0.80 (Jackson et al., 1997).

Relative VO2max is the metric most often used to examine physically demanding tasks such as
firefighting (Sothmann et al., 1990). Although firefighting tasks involve repetitive bodily
movements, they also require transporting materials of a constant weight, such as a 58-pound hose
bundle. Assuming the same percent body fat and relative VO2max, transporting a constant weight load
would be less demanding for someone with more fat-free mass than for someone with less fat-free
mass. For the same percent body fat and relative VO2max, persons with greater fat-free mass have
a higher absolute VO2max. This suggests that absolute VO2max (L/min or ml/min) would be a
more valid test than relative VO2max (ml/kg/min) to evaluate a worker's aerobic capacity to
perform repetitive material handling tasks with a constant load for all workers.
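The argument above can be checked with simple arithmetic. In the sketch below, the two workers, their 20 percent body fat, and their relative VO2max of 40 ml/kg/min are hypothetical. Dividing absolute VO2max by body weight plus load is a simplifying assumption made here to express capacity per kilogram actually moved; it is not a method taken from the cited studies.

```python
LB_TO_KG = 0.45359237
HOSE_BUNDLE_KG = 58.0 * LB_TO_KG  # the 58-pound hose bundle from the text


def body_weight_kg(fat_free_mass_kg: float, percent_fat: float) -> float:
    """Body weight implied by a fat-free mass and a percent body fat."""
    return fat_free_mass_kg / (1.0 - percent_fat / 100.0)


def absolute_vo2max(relative_ml_kg_min: float, weight_kg: float) -> float:
    """Absolute VO2max (ml/min) from relative VO2max and body weight."""
    return relative_ml_kg_min * weight_kg


def vo2max_per_kg_moved(absolute_ml_min: float, weight_kg: float,
                        load_kg: float) -> float:
    """Aerobic capacity per kg of body weight plus carried load (assumption)."""
    return absolute_ml_min / (weight_kg + load_kg)


PERCENT_FAT = 20.0       # hypothetical, same for both workers
RELATIVE_VO2MAX = 40.0   # hypothetical, ml/kg/min, same for both workers
workers = {"smaller": 50.0, "larger": 70.0}  # hypothetical fat-free mass, kg

results = {}
for label, ffm in workers.items():
    weight = body_weight_kg(ffm, PERCENT_FAT)
    absolute = absolute_vo2max(RELATIVE_VO2MAX, weight)
    loaded = vo2max_per_kg_moved(absolute, weight, HOSE_BUNDLE_KG)
    results[label] = (absolute, loaded)
    print(f"{label}: absolute {absolute:.0f} ml/min, "
          f"with hose bundle {loaded:.1f} ml/kg/min")
```

Unloaded, both hypothetical workers have the same relative VO2max; once the fixed load is added, the worker with more fat-free mass retains more capacity per kilogram moved.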

Davis and Associates (Davis, 1992) obtained VO2max data on a sample of 25 firefighters who
also completed a five-item work sample firefighter test. An analysis of these data showed that the
correlation between the work sample test and relative VO2max was -0.23 and not statistically
significant (p = 0.281). In contrast, the correlation between absolute VO2max and work sample test
performance was higher, -0.42, and statistically significant (p = 0.036). This showed that absolute
VO2max was a more valid test than relative VO2max, suggesting that firefighter
work sample test performance depended on both aerobic endurance and muscle mass.
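The significance of correlations from a sample of 25 can be approximated with the Fisher z-transformation, sketched below with the Python standard library. This is a normal approximation chosen here for simplicity; the original analysis most likely used an exact t test, so the p-values come out close to, but not identical with, the reported ones. The function name is an assumption.

```python
import math
from statistics import NormalDist


def approx_two_tailed_p(r: float, n: int) -> float:
    """Approximate two-tailed p-value for a correlation r from n pairs.

    Uses the Fisher z-transformation: atanh(r) * sqrt(n - 3) is treated
    as a standard normal deviate under the null hypothesis of no correlation.
    """
    if n <= 3:
        raise ValueError("n must be greater than 3")
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    z = math.atanh(r) * math.sqrt(n - 3)
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


N_FIREFIGHTERS = 25  # sample size reported in the text

p_relative = approx_two_tailed_p(-0.23, N_FIREFIGHTERS)
p_absolute = approx_two_tailed_p(-0.42, N_FIREFIGHTERS)
print(f"relative VO2max: r = -0.23, approximate p = {p_relative:.3f}")
print(f"absolute VO2max: r = -0.42, approximate p = {p_absolute:.3f}")
```

With so small a sample, only the larger correlation clears the conventional 0.05 level, which matches the pattern reported in the text.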

The data from a recent preemployment firefighter study (Jeanneret & Associates, 1999)
provide additional insight into these complex relationships. A total of 125 firefighters (31 women and
94 men) completed the five-item work sample firefighter test used by Davis and associates (Davis,
1992). The firefighters' aerobic fitness was assessed with the 1.5-mile walk/run test, which is a field
test of relative VO2max. Besides the aerobic tests, the firefighters' percent fat was measured from
skinfold fat (Jackson & Pollock, 1978; Jackson et al., 1980) and their strength evaluated with arm,
shoulder, torso, and leg isometric strength tests (Baumgartner & Jackson, 1999). Table 4.12
provides the multiple regression analyses examining the role of relative VO2max in combination with
strength or fat-free weight on log-transformed³ firefighter work sample test performance. The
unstandardized (b) and standardized (β) regression coefficients are provided. One model used
1.5-mile run time combined with fat-free weight, and the other replaced fat-free weight with strength. The two
regression models produced very similar results. The log-transformed firefighter work sample test
performance was an independent function of aerobic fitness and muscle mass. Muscle mass was
sampled two ways: by fat-free mass and by the sum of the four strength tests.


Table 4.12 Multiple regression analysis comparing the effect of fat-free weight and strength in combination
with 1.5-mile run and firefighter work sample test performance

                      Fat-Free Model              Strength Model
Variable              b            β              b            β

Intercept              0.816*                      0.723*
1.5-mile Run           0.053*       0.497          0.043*       0.418*
Fat-Free Weight       -0.005*      -0.683*
ΣFour                                             -0.001*      -0.612*
R                      0.91*                       0.85*

ΣFour = sum of the four isometric strength tests
* p < 0.01


Many physically demanding jobs are repetitive materials handling tasks. Often, tasks are termed
"endurance" or "strength" tasks. The data provided in this section illustrate that these tasks are
likely to be more complex and a function of a combination of different basic abilities. Additional
research is needed to define the role of aerobic fitness, body composition, and muscular strength
variables on work task performance. Not only would this provide a better understanding of
work task performance, it would also establish the scientific value of using combinations of basic
ability tests for selection and placement decisions.


Relationship  of  Basic  Abilities  on  Health 

A  suitable  level  of  physical  fitness  is  not  only  important  for  performing  work  tasks  but  it  also  is 
important  for  decreasing  the  risk  for  degenerative  diseases  and  work-related  musculoskeletal  injury. 
This  section  provides  a  brief  discussion  on  the  role  of  aerobic  fitness  and  body  composition  on 
health,  and  function  of  strength  on  musculoskeletal  injury.  This  is  covered  in  more  detail  in  the 
health  chapter  of  this  report. 


Aerobic  Fitness  and  Health 

The classic study from the Institute for Aerobics Research (Dallas, TX) showed that low
aerobic fitness predicted high mortality rates (Blair et al., 1989). The participants of the study were
healthy men and women who were free from diseases such as high blood pressure or diabetes. After
a maximal treadmill test, the participants were followed for several years. Figure 4.3 provides a
graphic summary of the study. The greatest drop in death rate was between the lowest and
moderate fitness groups. The death rates of the moderate and high fitness groups were more similar. This
study shows the beneficial effect of a moderate level of aerobic fitness on mortality. The "low fit"
group was the 20 percent of the men and women with the lowest age-adjusted aerobic fitness.

In a second study, the researchers discovered that changes in fitness were related to changes in
mortality risk (Blair et al., 1995). Those who improved their aerobic fitness by moving from the low
to the moderate or high categories reduced their future risk of death. Moderate fitness levels can
be attained by most people who engage in regular aerobic exercise by doing the equivalent of
walking about three miles a day. In a more recent study (Lee et al., 1999), low aerobic fitness was a more
powerful risk factor for all-cause mortality than obesity. The Institute for Aerobics data provide solid
epidemiological evidence showing that low aerobic fitness is a major risk factor for both
cardiovascular diseases and all-cause mortality.


Body  Composition  and  Health 

The relationship between weight and mortality is "J" shaped (Lew & Garfinkel, 1979). Being
overweight is associated with many medical problems such as hypertension, diabetes, and heart
disease. These illnesses lead to increased morbidity and reduced longevity. Overweight people are
more likely to be diabetic, hypertensive, and have higher cholesterol levels. Since these are major,
independent cardiovascular disease risk factors, some believe that the increased chance of
cardiovascular disease associated with being overweight is due just to these other cardiovascular disease
risk factors. This is not true. Being overweight places a person at a higher risk of heart disease and stroke
(Hubert et al., 1983). Not only did the data from the Framingham study (Hubert et al., 1983)
establish that obesity was a major, independent risk factor for cardiovascular disease, it also showed
that gaining weight resulted in a higher risk while losing weight lowered health risk.

[Figure: panels for Men and Women; axis label: Levels of Aerobic Fitness]

Figure 4.3 The participants in the lowest aerobic fitness group had the highest death rates. The mortality rate
of the moderate and highly fit group was about the same. The high mortality rate occurs mainly in the low
fitness group. Graph made from published data (Blair et al., 1989). From Baumgartner and Jackson (1999).

Although  the  relationship  between  obesity  and  cardiovascular  disease  is  well  documented, 
recent  data  suggest  that  obesity  plays  a  role  in  breast  cancer,  the  most  common  cancer  of  women. 
The  causes  of  breast  cancer  are  very  complex,  but  recent  research  showed  that  weight  gain  was  asso¬ 
ciated  with  the  risk  of  breast  cancer.  Medical  researchers  (Huang  et  al.,  1997)  studied  over  95,000 
U.S.  female  nurses  aged  30  to  55.  The  nurses  were  followed  for  16  years.  They  discovered  that 
weight  gain  after  the  age  of  18  years  was  unrelated  to  breast  cancer  incidence  before  menopause, 
but  was  associated  with  it  after  menopause.  Post-menopausal  weight  gain  increased  both  the  risk 
of  breast  cancer  and  mortality.  About  16  percent  of  the  breast  cancers  were  attributed  to  excessive 
weight  gain  (>  20.1  kg).  Avoiding  excessive  weight  gain  during  adulthood  appears  to  be  an  impor¬ 
tant  factor  in  reducing  a  woman’s  risk  of  breast  cancer. 

Strength and Injury


Industrial musculoskeletal injury is a major problem associated with physically demanding jobs.
Of all manual materials handling tasks, lifting accounts for about 50 percent of these injuries
(Ayoub, 1997; Snook et al., 1978). Nearly 25 percent of all worker compensation claims in the
United States are related to low-back injuries, with an economic effect estimated to be as high as
$20 billion (Ayoub, 1997). Some believe that a lack of strength increases the risk of injury
associated with industrial manual materials handling tasks.

Ergonomic research suggests that it is not only the strength of the worker that is involved, but
also the demands of the task. In a major retrospective study, Bigos and associates (Bigos et al., 1986)
reported that industrial musculoskeletal injury was not related to muscular strength. The injury
rates of stronger workers did not differ from those of their weaker counterparts. Missing from their
study were data on the demands of the work task. In contrast, ergonomic research (Dehlin et al., 1976;
Herrin, 1986; Magora, 1970; Snook et al., 1978) has established a relationship between industrial back
injuries and psychophysical lift capacity. Psychophysics quantifies the demand of physical tasks by
relative intensity, or a percentage of maximum capacity (Borg, 1998; Resnik, 1995). Both cross-sectional
(Herrin, 1986; Liles, 1984) and longitudinal (Chaffin, 1974; Chaffin, Herrin, & Keyserling,
1978) research showed that the risk of industrial back injury increased exponentially as the force
required to complete the lift approached the employee’s maximum isometric strength capacity. In
a classic epidemiological retrospective study, Snook and associates (Snook et al., 1978) showed
that workers were three times more susceptible to low-back injury if they lifted weight loads
psychophysically judged to be too heavy for 75 percent of the industrial population.
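
The idea of relative intensity can be illustrated with a short calculation. The sketch below is only an
illustration under assumed values: the function name, the 25-kg load, and the two strength values are
hypothetical and are not taken from the studies cited above.

```python
def relative_intensity(required_force_kg, max_strength_kg):
    """Return task demand as a percentage of the worker's maximum capacity."""
    if max_strength_kg <= 0:
        raise ValueError("maximum strength must be positive")
    return 100.0 * required_force_kg / max_strength_kg


# Hypothetical example: the same 25-kg lift performed by two workers.
stronger = relative_intensity(25.0, 50.0)    # lift uses half of maximum capacity
weaker = relative_intensity(25.0, 31.25)     # lift is much closer to maximum capacity
print(f"stronger worker: {stronger:.0f} percent of maximum")
print(f"weaker worker: {weaker:.0f} percent of maximum")
```

The absolute load is identical in both cases; only the relative intensity differs, which is the quantity
the psychophysical research relates to injury risk.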

A reason for including a back flexibility item in an adult fitness test (Golding et al., 1989) is
that the lack of low back flexibility is believed to be a risk factor for low back pain. Plowman
(1992) published a comprehensive review of the research relating flexibility and low back
pain. This review did not show that the lack of flexibility was a risk factor for low back pain. A
review of the literature also failed to provide evidence that flexibility was a valid predictor of
physically demanding work performance.


Summary 


This chapter reviewed the types of physical performance tests that can be used for evaluating a
worker’s capacity to do physically demanding work. In addition, a comprehensive discussion was
provided of the methods that can be used to measure aerobic fitness, body composition, strength,
muscular endurance, and flexibility. The tests most commonly used in the public sector include
strength and VO2max tests. Strength tests have been shown to be significantly correlated with
many materials handling tasks, and VO2max is correlated with endurance tasks such as firefighting.
Military research has examined the role of body composition on materials handling tasks and
showed that fat-free weight is related to these tasks. Work sample tests duplicate or simulate
critical work tasks. Although work sample tests can have the advantage of being content valid, the
limitations are that they can increase the risk of injury and do not provide information about the
applicant's maximum work capacity. Another potential limitation of work sample tests is that there can
be a factor of skill required by the test. Although an applicant may not have the skill to effectively
perform the task, with practice the skill can be mastered. A promising approach is to use basic
ability tests to define an applicant's physiological capacity to meet the demands of the task. This
strategy is discussed in more detail in the validity chapter. More research is needed to define the
physiological determinants of physically demanding work tasks. The final section of the chapter
provides a brief overview of the role of aerobic fitness, body composition, and strength on health and
injury. Epidemiological research documents that low aerobic fitness and unfavorable body
composition are associated with higher levels of degenerative diseases such as hypertension, stroke, heart
disease, and diabetes. Research has shown that the risk of injury is a function of the combined effect
of worker strength and demands of the task. The risk of injury increases as the demands of the task
approach the physiological capacity of the worker.

Endnotes 

1.  Many  people  use  the  term  lean  body  mass  or  lean  body  weight  instead  of  fat-free  weight.  Lean  body 
weight  has  a  density  of  less  than  1.100  g/cc  because  it  contains  from  2  to  3  percent  essential  lipid. 
Lohman (1992) maintains that fat-free weight is the proper term.

2.  These  investigators  included  Dr.  Andrew  S.  Jackson,  Department  of  Health  and  Human  Performance, 
University  of  Houston;  Dr.  Hobart  Osburn,  Department  of  Psychology,  University  of  Houston;  and 
Dr.  Kenneth  R.  Laughery,  Department  of  Psychology,  Rice  University. 

3.  The  raw  score  work  sample  test  distribution  was  skewed.  A  log  transform  was  used  to  more  closely 
approximate  a  normal  distribution. 

Abstract


This chapter examines the issues related to physical test validation for job selection. The chapter
is divided into three major sections. The first examines issues and accepted methods of test
validation. The focus is on the interpretation of the Equal Employment Opportunity Commission
(EEOC) guidelines (EEOC, 1978) as they relate to test validation. The sanctioned validation
methods are content validity, criterion-related validity, and construct validity. The measurement
theory used to evaluate the quality of employment tests is based on the American Psychological
Association standards for validating educational and psychological tests (A.P.A., 1985; A.P.A.,
1987). A major difference in physical test validation is the use of physiological rather than
psychological tests. The second section of the chapter examines the differences between physiological and
psychological test validation. The goal of physiological validation is to define the physiological
capacity needed by a worker to perform the work demanded by the task. Principal features of the
physiological validation approach are the use of a physiological metric to quantify test performance
and the interpretation of validity results with relevant physiological research and theory. The final
section of the chapter reviews published employment validation research on physical tests.


Employment  Selection  Tests 


The principal guidance for the design and implementation of selection tests for employment is
the Uniform Guidelines on Employee Selection Procedures issued by the Equal Employment
Opportunity Commission in 1978 (EEOC, 1978). These guidelines state that a selection procedure
has “adverse impact” if the selection rate for any group is less than 80 percent of the rate for the
group with the highest selection rate. Selection procedures that have adverse impact are considered
discriminatory unless they can be justified. A selection procedure that has adverse impact can be
justified if:

1.  The  tests  or  measures  are  derived  from  a  job  analysis 

2. The tests or measures are indicators of critical or important job duties, work behaviors, or
work outcomes

3. The tests or measures have been shown to be valid indicators of such duties, behaviors, or outcomes.

The existence of a procedure for justifying selection tests is critical in the area of selection based
on  physical  abilities.  There  are  well-recognized  differences  in  physical  abilities  between  genders 
(McArdle,  Katch,  &  Katch,  1996),  and  the  development  of  a  physical  abilities  selection  test  for 
physically  demanding  jobs  runs  a  great  risk  of  having  adverse  impact  across  gender. 
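
The 80 percent comparison of selection rates described above can be sketched as a short calculation.
This is a minimal illustration, not an official procedure: the function names, the default threshold
argument, and the applicant counts are hypothetical, and the group labels are used only as an example.

```python
def selection_rate(selected, applicants):
    """Proportion of applicants in a group who were selected."""
    if applicants <= 0:
        raise ValueError("a group must have at least one applicant")
    return selected / applicants


def adverse_impact_ratios(groups):
    """Map each group to its selection rate divided by the highest group rate.

    groups: dict of group name -> (number selected, number of applicants).
    """
    rates = {name: selection_rate(s, a) for name, (s, a) in groups.items()}
    highest = max(rates.values())
    if highest == 0:
        raise ValueError("no group has any selections")
    return {name: rate / highest for name, rate in rates.items()}


def has_adverse_impact(groups, threshold=0.80):
    """True if any group's rate is below the threshold share of the highest rate."""
    return any(ratio < threshold for ratio in adverse_impact_ratios(groups).values())


# Hypothetical counts: 60 of 100 men and 30 of 100 women pass a strength test.
counts = {"men": (60, 100), "women": (30, 100)}
print(adverse_impact_ratios(counts))
print(has_adverse_impact(counts))
```

A finding of adverse impact by this comparison does not by itself rule out the test; it triggers the
need for the justification described in the three conditions listed earlier.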

The nature of job analyses, identification of critical or important job duties, and nature of
physical selection tests are discussed in other sections. This section considers issues surrounding the
demonstration  of  the  validity  of  selection  tests  or  measures.  As  in  other  sections  of  this  report,  the 
emphasis  is  on  selection  based  on  physical  ability. 

Validity  of  Selection  Tests 

The extent to which a test or set of tests measures what it is meant to measure is called the
validity of the test. For the purposes of this chapter, validity is the accuracy with which selection
test(s) measure important work behaviors (Jackson, 1994). The Uniform Guidelines recognize
three  types  of  validity  with  respect  to  selection  test  development:  content  validity,  criterion-related 
validity,  and  construct  validity. 

Content Validity: A test has content validity when the test items reflect important elements
of the job. The job and test content are linked. Most content-valid test items are, in fact, job
samples or simulations of job tasks. Theoretically, for the test as a whole to be content valid, the test
items must sample all critical or important duties, work behaviors, or work outcomes. For example,
if a job has two critical, physically demanding tasks, one involving repeated lifting to a fixed height
and one involving carrying materials a long distance, both tasks need to be simulated in the
content-valid selection test. Such job sample tasks are usually scored as to whether the applicant can
or cannot perform the task. Additionally, for jobs that have time constraints, such as emergency
service tasks, there may be time limits imposed for task completion. Successful completion of the
tasks qualifies one for the job. Content-valid tests are the most defensible tests because they are the
most direct indicators of job performance capability. The closer the simulation is to the actual job
task, the more defensible it is as a selection test.

Criterion-Related Validity: A test is said to have criterion-related validity when the test items are
shown to be estimators or predictors of critical or important duties, work behaviors, or work
outcomes. Criterion-related validity is usually expressed as a correlation coefficient between test
performance (the predictor) and performance of an important or critical job element or behavior (the
criterion). The criterion job element can be any of a number of job behaviors including work-task
performance, injury rates on the job, absenteeism, or peer or supervisor ratings. Criterion-related
selection tests are not, by definition, direct indicators of the ability to perform a job or job task.
They rely on a secondary relationship between the criterion task and the predictor test.

Two types of criterion-related validity can be distinguished. A test is said to have concurrent
validity whenever the test is used to predict a current capability. An example is the use of a bench press
1-repetition maximum (1RM) to predict an applicant’s current ability to lift a 50-kg box to
elbow height. If the test is used to predict some future event, it is said to have predictive validity.
An example would be the use of the time to complete a 1-mile run as an indicator of future
success in a military training program.

Correlational studies are carried out to demonstrate criterion-related validity. Critical or
important job behaviors are determined during the job analysis. The nature of the critical job behaviors
usually suggests the nature of the selection test to be employed. If a critical task requires lifting, for
example, then selection tests that measure strength would be appropriate. If the critical job task
requires prolonged activity, then a test related to endurance, such as a run for time, might be
appropriate. Once candidate tests have been chosen, the tests are administered to a sample of workers or
another suitable sample. Their performance on the identified critical job tasks (or other criterion
measures) is also measured. The strength of the association between performance on the selection
tests and performance on the critical job behaviors is expressed as the correlation coefficient, whose
square is the proportion of variance shared by the two measures. If the correlation
coefficient between a selection test performance and performance on a critical job behavior is
suitably high, the selection test may be used. It should be noted that there is no standard for the
minimum acceptable correlation coefficient between a selection test and job behavior. Statistical
significance is not always a good indicator because with large sample sizes, a correlation that explains
only a small part of the variance can be significant. That which is possible or practical may drive
the selection of an acceptable level of correlation. As a benchmark, one might note that a
correlation coefficient of 0.707 indicates that 50 percent of the common variance in the relationship has
been explained, but this is difficult to use as a criterion because many things can affect the size of
a correlation coefficient. For example, the size of a correlation is influenced substantially by the
variability of the sample tested. It is also possible to have a high correlation but considerable errors in
prediction (Altman & Bland, 1983; Altman & Bland, 1986). This subject is covered in more detail
in another section of this chapter.
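
The distinction between the size of a correlation and its statistical significance can be shown with a
short calculation. The sketch below assumes the standard t statistic for testing a Pearson correlation
against zero, t = r * sqrt((n - 2) / (1 - r^2)); the function names and the sample sizes are hypothetical
and chosen only for illustration.

```python
import math


def shared_variance(r):
    """Proportion of variance shared by two measures with correlation r."""
    return r ** 2


def t_statistic(r, n):
    """t statistic (n - 2 degrees of freedom) for testing whether r differs from zero."""
    if n < 3 or abs(r) >= 1:
        raise ValueError("need n >= 3 and -1 < r < 1")
    return r * math.sqrt((n - 2) / (1 - r ** 2))


# The benchmark from the text: r = 0.707 corresponds to about half the variance.
print(round(shared_variance(0.707), 3))

# A weak correlation (1 percent shared variance) in a small and a large sample.
for n in (30, 1000):
    print(n, round(t_statistic(0.10, n), 2))
```

With 1,000 subjects the t statistic for r = 0.10 exceeds the large-sample two-tailed critical value of
about 1.96 at the 0.05 level, while with 30 subjects it does not, even though the shared variance is
1 percent in both cases.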

The scoring of criterion-related tests is based on the achievement of critical performance levels
on the selection test(s). These critical performance levels can be quite difficult to define. Usually,
they are derived from a mathematical function relating the predictor and criterion performances.
The value of the performance on the selection test that is associated mathematically with a critical
level of performance on the important job task is used as the cut-off score or cut-score on the
selection test. This critical level of job performance needs to be identified in the job analysis. This
subject is covered in more detail in another section of this chapter.

Even in the simplest case, when a single critical task, a single critical level of performance, and a
single predictor measure are identified, it can be difficult to set a critical level of performance on
the selection test. This is because the relationship between performance on the selection test and
performance on the criterion task is not perfect. As an example, Figure 5.1 shows the relationship
between the maximum weight box that can be lifted to elbow height (the work task) and 1RM for
arm-curl (the predictor test). As one can see, arm-curl 1RM and maximum box weight appear to be
strongly related. The correlation coefficient for this relationship is 0.875. Furthermore, the
relationship between the variables appears to be a straight line, as suggested by the diagonal line
crossing the figure. This line represents the linear regression of maximal box lift weight on arm-curl
1RM. However, the points are scattered about the line. If the critical task for a particular job involved
lifting a 50-kg box to elbow height (the value indicated by the horizontal line), the mean arm-curl
value associated with this box weight is 23.4 kg (the solid vertical line). This, ideally, would be the
critical arm-curl 1RM value that we would pick if arm-curl 1RM were the selection test for this job.
However, it is clear by inspection of Figure 5.1 that some individuals who lifted less than 23.4 kg on
the arm-curl could lift a 50-kg box. These individuals are called “false negatives” because they failed
the test (and are not selected) but can perform the work task. In Figure 5.1, the false negatives appear
in the upper left quadrant formed by the horizontal and vertical lines within the figure. Similarly,
some individuals who lifted more than 23.4 kg could not lift a 50-kg box. These individuals are
known as “false positives” because they passed the test (and were selected), but cannot perform the
work task. In Figure 5.1, these individuals appear in the lower right quadrant. The Uniform
Guidelines allow the exercise of a certain amount of judgment in setting cut-scores. However, one
needs to have a defensible rationale. These issues are examined in more detail in the physiological
validation section of this chapter.

Figure 5.1 Maximum box weight lifted to elbow height as a function of arm curl 1RM.
[Scatter plot; horizontal axis: Arm Curl 1RM (kg).]
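
The derivation of a cut-score from a regression line, and the resulting false negatives and false
positives, can be sketched as follows. This is an illustration only: the eight data pairs are invented
and are not the data plotted in Figure 5.1, and the function names are hypothetical.

```python
from statistics import mean


def fit_line(x, y):
    """Least-squares fit of y = intercept + slope * x; returns (slope, intercept)."""
    mean_x, mean_y = mean(x), mean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def cut_score(slope, intercept, critical_level):
    """Predictor value that the fitted line associates with the critical criterion level."""
    return (critical_level - intercept) / slope


def classify(x, y, cut, critical_level):
    """Count the four outcomes of applying the cut-score to each person."""
    counts = {"true_positive": 0, "false_positive": 0,
              "true_negative": 0, "false_negative": 0}
    for xi, yi in zip(x, y):
        passed = xi >= cut              # passed the selection test
        able = yi >= critical_level     # can perform the work task
        if passed and able:
            counts["true_positive"] += 1
        elif passed:
            counts["false_positive"] += 1
        elif able:
            counts["false_negative"] += 1
        else:
            counts["true_negative"] += 1
    return counts


# Invented data: arm-curl 1RM (kg) and maximum box weight lifted (kg).
arm_curl_kg = [15, 18, 20, 22, 24, 26, 28, 30]
box_lift_kg = [35, 42, 52, 46, 48, 55, 60, 64]

slope, intercept = fit_line(arm_curl_kg, box_lift_kg)
cut = cut_score(slope, intercept, critical_level=50)
counts = classify(arm_curl_kg, box_lift_kg, cut, critical_level=50)
print(round(cut, 1), counts)
```

With these invented values the cut-score falls between 22 and 24 kg, and one person lands in each of
the false negative and false positive cells, which mirrors the two off-diagonal quadrants described
for Figure 5.1.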

Construct Validity: Construct validity is the most indirect and theory-driven method of
establishing validity. Construct validity exists when selection tests are related to a general trait or set of
characteristics (the construct) that is associated with successful accomplishment of important or
critical job behaviors. The establishment of construct validity requires that employers show that a
construct (a general trait or set of characteristics) is required for satisfactory job performance, and
that the selection test or tests measure this same construct.

Constructs are often developed using the statistical technique of factor analysis (Rummel,
1970). In factor analysis, a number of correlated variables are reduced to a smaller number of
dimensions or factors. Within the factor, each of the included variables has a coefficient or
“loading,” a numerical value indicating the strength of association of that variable with the factor. The
greater the loading, the greater the association between the variable and the factor. The factor is
defined mathematically as the sum of the factor variable values, each multiplied by its loading. The
variables with the greatest loadings drive the theoretical interpretation of the factor.
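
The weighted sum that defines a factor can be written out directly. The sketch below follows the
definition in the preceding paragraph; the variable names, the loadings, and the standardized scores
are hypothetical, and real factor-analysis software may scale factor scores differently.

```python
def factor_score(values, loadings):
    """Sum of each variable's value multiplied by its loading on the factor."""
    missing = set(loadings) - set(values)
    if missing:
        raise ValueError(f"no value supplied for: {sorted(missing)}")
    return sum(values[name] * loading for name, loading in loadings.items())


# Hypothetical loadings of three strength tests on a single "strength" factor.
strength_loadings = {"arm_curl": 0.8, "bench_press": 0.7, "grip": 0.5}

# Hypothetical standardized (z) scores for one applicant.
applicant = {"arm_curl": 1.0, "bench_press": 0.5, "grip": -1.0}

print(round(factor_score(applicant, strength_loadings), 2))
```

Because arm curl and bench press carry the largest loadings in this made-up example, they contribute
most to the score and would drive the interpretation of the factor.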

Construct validity can be established in three ways:

1. Performances on job behaviors can be analyzed to determine dimensions within the job.
Scores on selection tests can then be shown to be correlated with the job dimensions.

2. Scores on selection tests can be factor analyzed, and dimensions within the selection tests
identified. A number of examples of such analyses can be found in the literature
(Fleishman, 1964; Hogan, 1991a; Meyers, Gebhardt, Crump, & Fleishman, 1984).

3. Factor scores from the dimensions of the selection tests can be shown to be correlated to
performance on important job behaviors. Both potential selection test items and
performance on important job behaviors can be factor analyzed. A validity study can then be
carried out to analyze the associations between the selection factors and the job factors.

These  options  are  indicated  schematically  in  Figure  5.2. 

Figure 5.2 is an oversimplified version of the actual situation. Often, more than one construct
is present in the job behaviors. For example, strength and endurance may be required for job
success. In such a case, many more relationships must be worked out in the validity study.

The conduct of a study to demonstrate construct validity is similar to that for criterion-related
validity. The difference is that instead of a one-to-one mapping of performance on a selection test to
performance on a job behavior, several selection-test items are measured and used to calculate factor
scores that represent the selection constructs, and/or several job behaviors are measured to
calculate factor scores that represent the job constructs. It is these factor scores that are
used in the correlational analysis. Construct-validity relationships are often difficult to demonstrate
because of the need to identify the factor structures in the job and selection tests and then establish
associations between or among them. Given these difficulties, many employers choose to use the
measures of underlying constructs directly as elements of criterion-related validity studies.

Figure 5.2 Three experimental designs for construct validity studies. Single-ended arrows indicate
variables included in the factor analysis. Double-ended arrows indicate correlations to be measured.


Requirements  for  Validity  Studies 

The Uniform Guidelines provide general and technical standards for validity studies. Among
the general standards are the following:

* In addition to specifying the three types of studies (content, criterion-related, and
construct-validity), the guidelines require the studies to be consistent with applicable
professional standards for such research, accurate, and free from bias.

* The validity studies should be documented.

* The employer must be prepared to justify the method used to implement the selection tests.
If use of a test has greater adverse impact when used as a ranking device than if it were
implemented as a simple pass/fail, then the employer must provide sufficient evidence of the
validity and utility to support use of the test to rank-order participants.

* Selection procedures may be developed for higher level jobs in cases where most of the
entry-level applicants will progress to those higher level jobs.

* An employer may continue to use selection procedures for which there is not yet full validity
evidence as long as the employer has evidence of the substantial validity of the procedures and
will conduct, when technically feasible, a study to produce the additional evidence required.

* Employers may also use validity studies conducted by others when it can be shown that the
validity studies were conducted properly and that the jobs involve substantially the same
major work behaviors for the employer as for those who conducted the study.

* Employers, labor organizations, and employment agencies are encouraged to work
together and cooperate in validity studies.

* Finally, under no circumstances will the general reputation of a test or other selection
procedures or casual reports of its validity be accepted in lieu of evidence of validity.

The minimum technical standard called for in the guidelines for all tests is that validity
studies be based on a review of information about the job (a job analysis). The technical standards
differ somewhat by type of validation study. Tables 5.1, 5.2, and 5.3 summarize these standards
by validation method.

Table 5.1 EEOC Technical standards Guidelines for the criterion validation method


Technical Standard for Criterion-Related Validation Studies


1. The study must be technically feasible. It must be possible to get an adequate sample size to provide a scientifically sound result.
However, an employer is not required to hire or promote individuals in order to be able to conduct a criterion-related study.

2. Whether the study is to be concurrent or predictive, the sample subjects should be representative of the individuals who might reasonably be expected
to fill the positions being studied.

3. In general, the guidelines indicate the finding of a significance level P < 0.05 to be acceptable.

4. However, users should evaluate each selection procedure to assure that it is appropriate for operational use. In general, the greater the magnitude of the
correlations found between the job behaviors and the tests, and the greater the number of job behaviors predicted by a particular test, the more appropriate
it is for implementation. Selection procedures derived from studies with large sample sizes and low correlations, and sole reliance on a selection instrument
that is related to only one of many critical job behaviors will be subject to close review.

5. Users must avoid use of techniques that can lead to inflated validities for selection procedures. Examples include reliance on a few selection
procedures or criteria when many were studied, and use of the statistics from one sample when they may not have held up well on cross-validation.
The Guidelines recommend large samples and use of cross-validation.

6. The Guidelines call for the maintenance of 'fairness' in selection procedures. Essentially, unfairness results when members of one group characteristically
obtain lower scores on a selection procedure than members of another group, but the differences in scores on the selection instrument are not manifest in
differences in job performance. The guidelines call for investigation of the fairness of selection procedures whenever a selection device has adverse impact.


Table 5.2 EEOC Technical standards Guidelines for the content validation method


Technical Standard for Content Validation Studies


1. Consideration must be given to the appropriateness of a content validity strategy. Such a strategy is not appropriate when the job tasks represent knowledge,
skills, and abilities that an employee is expected to learn on the job. It is also not appropriate for demonstrating the validity of selection procedures
that claim to measure traits or constructs such as intelligence, aptitude, personality, common sense, judgment, and leadership.

2. The job analysis must focus on the important work behaviors, their relative importance across all behaviors, and the products of such work behaviors.
To be included in a work sample, the behaviors must be observable, and some aspect of them must be measurable. The work behaviors selected
for measurement should be critical and/or important work behaviors that constitute most of the job.

3. To demonstrate content validity of a selection procedure, it must be shown that the behaviors are a representative sample of behaviors of the job
or that the selection procedure offers a representative sample of the work product of the job. For selection procedures measuring a skill or ability,
the procedures must closely approximate an observable work behavior or work product. The closer the content and the context of the selection tests
are to work samples and work behaviors, the more suitable they are for showing content validity.

4. Whenever feasible, measurement of the reliability of the selection procedures should be carried out.


Table 5.3 EEOC Technical standards Guidelines for the construct validation method


Technical Standard for Construct Validity Studies*


1. The Guidelines recognize that establishment of construct validity is a more complex strategy than either content or criterion-related validity,
and that there was, at the time of the Guidelines' publication, a lack of literature extending the concept to employment practices.

2. Therefore, the job analysis must be carried out in a fashion that allows the identification of constructs underlying the important job behaviors.
Each construct discovered should be named and defined to distinguish it from all other constructs so discovered.

3. Selection procedures should then be developed or identified that measure the work behavior constructs. The users must then show that the selection
procedures are related to the work behavior constructs and that the work behavior constructs are validly related to the performance of important or
critical work behaviors.

4. The Guidelines allow limited use of construct validity studies. "Until such time as professional literature provides more guidance on the use of
construct validity in employment situations, the Federal agencies will accept a claim of construct validity without a criterion-related study ...
only when the selection procedure has been used elsewhere in a situation in which a criterion-related study has been conducted and the use
of a criterion-related validity study in this context meets the standards for transportability of criterion-related validity studies set forth above."


* See Figure 5.2


Physiological  Validation 


The validation models identified in the EEOC Guidelines (EEOC, 1978) are based on the
American Psychological Association standards for validating educational and psychological tests
(A.P.A., 1985; A.P.A., 1987). A major difference when validating physical tests is the use of
physiological, not psychological, tasks. Physiological tests differ from educational and psychological tests.

The goal of physiological validation is to match the worker with the physiological demands of
the job. An essential element of this process is the quantification of the task's physiological stress.
The recent court ruling of Lanning v. SEPTA (U.S. 3rd Circuit 1999) gives legal support to
physiological validation. The case is discussed in greater detail in Chapter 7 of this State-of-the-Art
Report (SOAR). A key issue in the Lanning v. SEPTA case was setting a valid aerobic fitness
cut-score. The recommended cut-score represented a VO2max of 42.5 ml/kg/min. The court ruled the
standard to be unacceptable because the test developers failed to identify the minimum aerobic
capacity demanded by the job.

The tradition and “standard practice” in validating criterion-related physiological tests is to
use the metric of the dependent variable (i.e., the criterion test) as the basis for evaluating a
subject's work capacity, which is sampled with the predictor test. The metric of the criterion variable
has physiological significance. This physiological test validation methodology is clearly illustrated
by concurrent validation research on body composition and VO2max. To illustrate,
Brozek and Keys (1951) not only reported the concurrent validity coefficient between the predictor
test, skinfold fat, and the criterion variable, hydrostatically measured percent body fat, but also
published the first regression equation providing a valid model to interpret a subject’s skinfold fat
measurement by the more meaningful metric of percent body fat. As another example, the maximum
treadmill test following a standard protocol is a method of measuring VO2max. Concurrent
validation studies (Bruce, Kusumi, & Hosmer, 1973; Foster, Jackson, & Pollock, 1984;
Pollock et al., 1976) published regression equations that estimate VO2max
(ml/kg/min) from treadmill time. The metric used to interpret aerobic fitness is VO2max, not
elapsed treadmill time. The next section of this chapter examines differences in the validation of
physiological and psychological tests.


Differences in Physiological and Psychological Test Validation

Although  the  psychological-based  validation  strategies  outlined  in  the  EEOC  Guidelines  are 
suitable  for  validating  physical  tests,  there  are  at  least  three  important  differences. These  include  the 
test  metric  used,  the  work  task  definition,  and  the  matching  of  the  worker  to  the  demands  of  the  task. 

Test Metric — The first major difference between psychological and physiological tests is the test’s
metric. Typically, the metric of physiological tests is a ratio measurement scale. In contrast, scaling of
psychological tests is either ordinal or interval. The units of measurement of physiological tests
include percent body fat, oxygen uptake, caloric expenditure, force exerted, pounds lifted, weight load
transported, and various types of power output, to name a few. The unit of measurement has
physiological significance. In contrast, the unit of measurement of psychological tests is typically an
individual's response on a knowledge test or response to some type of scale (e.g., Likert scale). The unit
of measurement on psychological tests is of little importance. This is evidenced by the common
practice of transforming scores on psychological tests from the original metric into some form of standard
score with a known mean and standard deviation, such as 500 and 100. The person’s score is
interpreted relative to the mean and standard deviation of the test. In contrast, a physiological test is not
only interpreted with the mean and standard deviation of a population, but the value can also have an
important physiological meaning. For example, a VO2max of 20 ml/kg/min not only signifies that a
person has low fitness by normative standards but also indicates that the person lacks the physiological
capacity to perform work tasks with an energy cost that exceeds the person’s low aerobic capacity.

Accurate Quantification of Work Demands — A characteristic of physiological test validation is
that the physical demands of work tasks can often be objectively measured. This is because of the
capacity to define the physical demands of the work task. Extensive physiological research has
defined the energy expenditure of a host of occupational, recreational, and fitness tasks by
measuring oxygen consumption while doing the tasks (Durnin & Passmore, 1967; Passmore & Durnin,
1955). These energy-cost tables are published in basic exercise physiology texts (Astrand & Rodahl,
1986; Brooks & Fahey, 1984; McArdle, Katch, & Katch, 1991; Wilmore & Costill, 1994). The
forces required to “crack” valves and push or pull objects can be measured with torque wrenches and
electronic load cells (Jackson, Osburn, Laughery, & Sekula, 1998; Jackson, Osburn, Laughery, &
Vaubel, 1992). The demands of materials-handling tasks can be defined by weight load, type of lift,
lift rate, and distance transported (Jackson, Osburn, Laughery, & Young, 1993a; Waters et al.,
1999; Waters, Putz-Anderson, Garg, & Fine, 1993). These objective data define the physiological
stress demanded by work tasks.

Match the Worker to the Physiological Demands of the Task — A final difference between
physiological and psychological test validation is the capacity to match the worker to the physiological
demands of the work task. Once the demands of the work task are known, the next step of a
physiologically-based validation strategy is to determine if a worker has the capacity to meet the
demands of the task. This was the method used to define the minimum energy cost (i.e., VO2max)
required for firefighting (Sothmann et al., 1990). This research showed that individuals with a
VO2max below 33.5 ml/kg/min were unable to meet the demands of firefighting. A goal of
ergonomic research has been to define the strength levels needed to do industrial tasks safely (Keyserling et
al., 1980; Keyserling, Herrin, & Chaffin, 1980). The next sections of this chapter discuss these
methods in more detail.

Physiological  Validation— Test  Fairness 

The  goal  of  a  physiological  criterion-related  strategy  is  not  only  to  estimate  the  validity  of  the 
test  but  also  determine  the  minimum  physiological  level  required  by  the  task.  A  second  important 
element  of  this  approach  is  the  physiological  interpretation  of  the  obtained  data  analyses. 
Interpretation  of  the  statistical  results  of  validation  research  with  relevant  physiological  theory  and 
published  research  provides  a  scientific  rationale  to  explain  the  results.  Failure  to  do  this  leaves  the 
validation  results  open  to  question. 

An important issue to resolve in a criterion-related study is whether the preemployment test is
fair. Unfairness is defined as a situation in which members of a protected group obtain lower scores
on a preemployment test than members of another group, but the difference in scores is not
reflected in differences in the criterion of job performance (EEOC, 1978). The corresponding analysis
is called the Cleary test of fairness; fairness is affirmed by showing that the regression line that
defines the relationship between the preemployment test and the criterion is common to both groups.
The statistical procedure is to test for homogeneity of regression slopes and intercepts
(Arvey & Faley, 1988; Jackson, 1989; Pedhazur, 1997). The literature provides examples of the use
of this test (Arnold, Rauschenberger, Soubel, & Guion, 1982; Reilly, Zedeck, & Tenopyr, 1979).

Although the Cleary test may evaluate the fairness of an employment test, the analyses can also
provide a physiological interpretation of the employment test. The Cleary test is the method of
determining whether a common regression equation can be used to explain the relationship
between the predictor and criterion tests of two groups. In physical test validation, the two groups
are typically male and female applicants. The data analysis strategy is first to determine whether
the two groups share a common regression slope and then decide whether the groups' regression
intercepts are within chance variation. Multiple regression is the statistical model used to test for
fairness. This analysis involves dummy-coding the group variable (e.g., female = 0,
male = 1) and forming a group-by-predictor-test interaction term (Pedhazur, 1997). The
statistical strategy used is to generate a full multiple regression consisting of the three variables—

1.  a  predictor  test 

2.  a  dummy-coded  group  variable,  and 

3.  an  interaction  term,  which  is  the  product  of  the  group  and  test  variables. 

The next step is to generate two restricted regression models: the first with two independent
variables, the group variable and the predictor test; and the second with just the predictor variable.
Group differences in slopes and intercepts are evaluated by testing the change in
R² between the full and restricted models. Pedhazur (1997) outlines these statistical methods and
tests of significance. These methods are illustrated next with physiological data. Also shown are the
role and importance of the physiological interpretation of the results.
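
The full and restricted models described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the procedure used in the cited studies: the function names (r_squared, f_change, cleary_test) are hypothetical, the data are simulated, and the F ratio is the standard test for a change in R² between nested regression models.

```python
import numpy as np


def r_squared(predictors, y):
    """R^2 of an ordinary least-squares fit of y on the given predictor columns."""
    design = np.column_stack([np.ones(len(y))] + list(predictors))
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    residual = y - design @ coef
    return 1.0 - residual @ residual / np.sum((y - y.mean()) ** 2)


def f_change(r2_full, r2_restricted, n, k_full, k_dropped=1):
    """F ratio and error df for the R^2 lost when k_dropped terms are removed."""
    df_error = n - k_full - 1
    f = ((r2_full - r2_restricted) / k_dropped) / ((1.0 - r2_full) / df_error)
    return f, df_error


def cleary_test(test, group, criterion):
    """Compare the full model with the two restricted models."""
    test, group, criterion = (np.asarray(v, dtype=float)
                              for v in (test, group, criterion))
    n = len(criterion)
    r2_full = r_squared([test, group, test * group], criterion)   # 3 variables
    r2_test_group = r_squared([test, group], criterion)           # no interaction
    r2_test_only = r_squared([test], criterion)                   # predictor only
    slope_f, slope_df = f_change(r2_full, r2_test_group, n, k_full=3)
    intercept_f, intercept_df = f_change(r2_test_group, r2_test_only, n, k_full=2)
    return {"slope_F": slope_f, "slope_df": slope_df,
            "intercept_F": intercept_f, "intercept_df": intercept_df}


# Simulated (hypothetical) data: 200 subjects with deliberately different slopes.
rng = np.random.default_rng(0)
group = np.repeat([0.0, 1.0], 100)           # dummy code: 0 = female, 1 = male
strength = rng.uniform(200.0, 800.0, 200)    # made-up predictor scores
criterion = np.where(group == 0, 20 + 0.5 * strength, 170 + 0.2 * strength)
criterion = criterion + rng.normal(0.0, 25.0, 200)

result = cleary_test(strength, group, criterion)
print(result)
```

A large slope F indicates that the regression lines are not parallel; when the slopes are homogeneous, the intercept F then tests whether the parallel lines are separated by more than chance variation.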

Group Difference in Regression Slopes — A task analysis of freight mover tasks showed that
rapidly moving packages from a container to a conveyor belt was a physically demanding task (Jackson
et al., 1993a). A work-sample test was developed to duplicate the demands of this repetitive
transport task. The task involved moving packages that ranged in weight from about 15 to 80 pounds.
The distribution of package weights was representative of the weight distribution encountered by
workers. Exercise heart rate was measured to ensure the work rate of the simulation test was
representative of the actual work rate. The subjects were instructed to work at a brisk rate
consistent with their fitness and not to move packages that exceeded their capacity.

Figure 5.3 is the bivariate relationship between the predictor test (sum of isometric strength)
and the criterion test (materials transport, expressed in a metric of power output, the pounds of
freight transported per minute). The data are contrasted by gender. Analysis of these data showed
that male and female regression lines were not parallel. The R² change between the full model and
the restricted model of the strength test and dummy-coded gender variable was 0.04, which was
statistically significant (F(1, 199) = 18.96; p < 0.01). The graph shows that the slope for the female
subjects (0.534) is more than twice as steep as the slope for male subjects (0.208).

A strict interpretation of the Cleary test would indicate that the strength test was unfair, but a
physiological interpretation of the data gives a clearer view. Post hoc examination of the data
showed that many females could not lift and transport the heavier packages. The lift weight
exceeded their strength capacity. The steeper female slope showed that individual differences in strength
were more important for females than males. The stronger women could lift the heaviest weight
loads while the weaker women could not. A major determinant of the female capacity to move
freight was the subject’s strength-dependent capacity to lift heavy loads. In contrast, most men had
the physiological capacity to lift and transport the heaviest loads. These physiological data would
be important information for setting a cut-score consistent with the demands of the task. The data
could also have important ergonomic implications that could lead to job redesign, such as a
company policy limiting the weight of packages they would transport.

Male: Y' = 173.772 + (0.208 x Strength) R² = .23
Female: Y' = 18.332 + (0.534 x Strength) R² = .55

Figure 5.3 Test for fairness, example of significant differences in male and female regression slopes

Intercept Differences — The second part of the Cleary test is to evaluate differences in regression
intercepts. Figure 5.4 shows a physiological example of intercept differences in the form of the
scatterplot of published male and female body composition data (Jackson & Pollock, 1978; Jackson,
Pollock, & Ward, 1980). The independent variable is the sum of seven skinfold measurements, and
the dependent variable is percent body fat measured by the underwater weighing method. The
figure shows that the slopes of the male and female regression lines are parallel; the differences in slope
are within random variation (F(1, 675) = 1.25; p > 0.05). The R² difference between the full model and
the restricted model with gender and the sum of skinfolds was 0.0004. Adding the dummy-coded
gender variable to the sum of skinfolds accounted for more than 12 percent of percent fat variance
(F(1, 675) = 398.75; p < 0.01). As these data show, the significant intercept difference indicates that for
a given score on the predictor test (sum of skinfold fat), the criterion score of one group (in this
instance, measured percent body fat) can be expected to be systematically higher. The
regression lines differed by an average percent body fat of about 6 percent.

Female: %FAT = 6.392 + (0.143 x Σ7) R² = .70
Male: %FAT = 1.324 + (0.135 x Σ7) R² = .78

Figure 5.4 Test for fairness, example of parallel regression slopes, but significant differences in male and
female regression intercepts
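
As a worked illustration of the intercept difference, the two regression equations reported with Figure 5.4 can be evaluated side by side. The sketch below is an assumption-laden example rather than part of the original analysis: the function names are hypothetical, and the three skinfold sums (taken here to be in millimeters) are arbitrary values chosen only to show the size of the gap.

```python
def percent_fat_female(sum7):
    """Female equation reported with Figure 5.4."""
    return 6.392 + 0.143 * sum7


def percent_fat_male(sum7):
    """Male equation reported with Figure 5.4."""
    return 1.324 + 0.135 * sum7


def gender_gap(sum7):
    """Difference in estimated percent fat at the same sum of skinfolds."""
    return percent_fat_female(sum7) - percent_fat_male(sum7)


# Hypothetical sums of seven skinfolds (units assumed to be millimeters).
for sum7 in (50, 100, 150):
    print(sum7,
          round(percent_fat_female(sum7), 1),
          round(percent_fat_male(sum7), 1),
          round(gender_gap(sum7), 1))
```

Over this range the gap stays between roughly 5.5 and 6.3 percentage points, consistent with the average difference of about 6 percent noted above.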

A “blind” application of the Cleary test would indicate that the test was unfair. A physiological
interpretation of these results provides a clear rationale for the intercept difference. Skinfold fat
measures subcutaneous fat, but the body has two types of fat, subcutaneous and essential fat.
Hydrostatically determined percent body fat measures both sources of body fat. It is well
established that the essential fat of women is greater by about 7 percent of body mass than that of men
(McArdle et al., 1996). The gender difference in intercepts can therefore be
explained by differences in essential fat.

Although this body composition example does not represent a work-sample test, the use of
body composition tests has been an interest of military researchers (Marriott, 1992). It is
well-documented that percent fat is inversely related to performance on strenuous tasks that involve
moving the body. This body composition example shows that if percent body fat is used to evaluate male
and female performance on common physical tasks (e.g., running, climbing), the test must be expressed in
the physiological metric of percent body fat, not the sum of skinfold fat. In contrast, if the goal is
to evaluate fitness rather than the capacity to meet the demands of a work task, gender-based
standards are appropriate (Gettman, 1993).

Common Slope and Intercept — The example provided in this section illustrates the homogeneity
of male and female regression lines for the predictor and criterion tests. Figure 5.5 gives the
scatterplot of the male and female relationship between isometric strength and peak push force. A task
analysis showed that pushing was a physically demanding task required of workers who moved
freight containers (Jackson et al., 1993a). The mean push force of the males was 124.6 pounds (SD = 42.2)
compared with a mean of 70.0 pounds (SD = 28.3) for the females. This difference was statistically
significant (F(1, 205) = 99.89; p < 0.01). The figure shows that the male and female regression lines are
similar. Statistical analysis showed that the slopes (F(1, 205) = 1.50; p > 0.05) and intercepts
(F(1, 203) = 2.00; p > 0.05) of the male and female regression lines did not differ significantly. The group and
group-by-strength variables accounted for less than 0.1 percent of the push-force variance. This
demonstrated that differences in the regression lines shown in the figure were random variance.
This analysis demonstrated that a single regression line can be used to estimate push force from
isometric strength, and documented that the gender mean difference in work task performance
depended on strength, not gender.

Figure 5.5 Test for fairness, example of homogeneity of male and female regression slopes and intercepts

Physiological  Validation — Cut-Score 

Once the predictor test has been shown to be valid, the next step of a physiological validation
strategy is to define performance on the predictor test associated with the desired level of
performance on the criterion. An important and often difficult part of this analysis is defining the
critical level of performance on the criterion variable. In some instances, a clear definition of an
essential task is apparent, for example, lifting a 75-pound industrial valve from the ground to the back
of a truck. In other instances, the physiological demands of a task can be difficult to quantify
accurately. Shoveling coal is a physically demanding task of coal miners (Jackson & Osburn, 1983), but
what level of intensity and duration of shoveling are suitable? Firefighter work simulation tests are
timed tests that involve completing several firefighter tasks. Although a firefighter test may be
clearly content valid, a more difficult phase of the validation process is to determine the time that
signifies successful firefighting capacity (Jeanneret & Associates, 1999).

Regression models provide valid statistical methods of estimating physiological capacity,
within a defined degree of accuracy, from a predictor test or combination of tests. Simple linear and
nonlinear regression models are used with a single predictor test, and multiple regression models
are used with several predictor tests (Pedhazur, 1997). This is a well-established physiological test
validation method (ACSM, 1991; Astrand & Ryhming, 1954; Brozek & Keys, 1951; Bruce et al.,
1973; Durnin & Womersley, 1974; Foster et al., 1984; Jackson, 1990; Jackson & Pollock, 1978;
Jackson et al., 1980; Pollock et al., 1976). The following provides regression examples of defining
physiologically based standards with continuously scaled and pass/fail criterion variables.

Continuously Scaled Criterion — This first example shows the use of simple linear regression to
define the strength needed to generate the push force required by a task. The job analysis (Jackson
et al., 1993a) showed that one physically demanding job of freight workers was pushing or pulling
containers loaded with freight. As part of the job analysis, an electronic load cell defined the peak
force required to move freight containers that varied in weight. The subject’s peak push force was
measured with an isometric push test that simulated the position used to push containers. Figure
5.5 shows the scattergrams with the male and female regression lines. As shown earlier, the
differences between the slopes and intercepts of the male and female regression lines were within chance
variation, which supports the fairness of using a single regression line to define this relationship.
The regression equation is—

Push Force Regression Equation (R = 0.78, SEE = 29.0 lbs) (1)

Push Force (lbs) = 2.031 + (0.198 x Strength)

The regression equation provides a valid model for defining the strength needed to generate the
push force needed to move containers of the criterion weight. Once this is known, the strength
associated with this push force can be determined. To illustrate, assume the criterion push force was
defined to be 100 pounds of force. The regression equation shows that a strength score of 495
estimates a push force of 100 pounds.
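
The arithmetic behind this cut-score can be made explicit by solving Equation 1 for strength. The short sketch below uses only the intercept and slope reported in Equation 1; the function names are hypothetical and chosen for illustration.

```python
INTERCEPT = 2.031   # intercept of Equation 1 (pounds of push force)
SLOPE = 0.198       # slope of Equation 1 (push force per pound of strength)


def predicted_push_force(strength):
    """Equation 1: estimated push force (lbs) for a sum-of-strength score."""
    return INTERCEPT + SLOPE * strength


def strength_for_push_force(target_force):
    """Equation 1 solved for strength: the score that estimates target_force."""
    return (target_force - INTERCEPT) / SLOPE


cut_score = strength_for_push_force(100.0)
print(round(cut_score))                        # rounds to 495
print(round(predicted_push_force(495.0), 1))   # about 100 pounds
```

The same inversion applies to any other criterion push force that a job analysis might define.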

The goal of a physiological model of validation is to define the minimum physiological capacity
demanded by the work task. The regression model provides empirical evidence to define a
physiologically defined cut-score within a defined level of probability. Although physiological test scores
typically yield higher criterion-related validity coefficients than psychological tests, they still have
substantial prediction errors. Figure 5.6 shows the prediction errors associated with the push force
task. Provided is an Altman-Bland plot (Altman & Bland, 1983; Altman & Bland, 1986) of the push
force data estimated from isometric strength (see Figure 5.5). The Altman-Bland method plots the
residual scores (Y - Y', measured minus estimated push force) against the
average of measured and estimated push force. Although the correlation between the criterion, push
force, and predictor, isometric strength, was high, 0.78, the Altman-Bland plot shows that defining
the physiological criterion is not error free. The variability on the Y axis is defined by the standard
error of estimate of the regression analysis, which, in this example, is 29 pounds of push force.

Figure 5.6 Reprinted, by permission, from Altman, D. G., Bland, J. M. (Altman-Bland plot of prediction
residuals (measured - estimated) contrasted by the average of measured and estimated maximum push
force), pp. 307-310, © by the Lancet Ltd., 1986.
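
The coordinates of an Altman-Bland plot are simple to compute. The following sketch is illustrative only: the three subjects and their measured push forces are invented, the function name is hypothetical, and the estimated values come from Equation 1.

```python
def altman_bland_points(measured, estimated):
    """Return (average, residual) pairs; residual = measured - estimated."""
    return [((m + e) / 2.0, m - e) for m, e in zip(measured, estimated)]


# Hypothetical subjects: strength scores and measured push force (lbs).
strength = [300.0, 450.0, 600.0]
measured = [70.0, 85.0, 130.0]

# Estimated push force from Equation 1.
estimated = [2.031 + 0.198 * s for s in strength]

points = altman_bland_points(measured, estimated)
for average, residual in points:
    print(round(average, 1), round(residual, 1))
```

Plotting the residual (vertical axis) against the average (horizontal axis) shows whether prediction errors are centered on zero and whether their spread changes with the level of push force.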

Because the correlation between a predictor variable and the criterion test is always less than 1,
there will always be prediction errors. The standard error of estimate provides an estimate of the
variation in prediction error. Although it is not possible to define an exact physiologically-based
cut-score, it is possible to define a standard with a defined degree of probability. The regression
equation (Equation 1) provides a valid model that defines the relationship of strength with push
force. As shown earlier, a strength score of 495 pounds is associated with a push force of 100 pounds.
Because the correlation between the two tests is less than perfect and there are prediction errors,
only 50 percent of subjects with 495 pounds of strength would be expected to have the capacity to
generate 100 pounds of push force. The regression model's standard error of estimate can be used to
define the probability that someone, with a given level of strength, would meet the physiologically
based standard. Figure 5.7 shows the relationship between level of isometric strength and probability
of being able to generate 100 pounds of push force. The probability estimates provide additional data that
can be used to define a physiological criterion that is congruent with the criticality of the task, and
the mission and unique organizational characteristics.
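
One way to turn the standard error of estimate into such probabilities is sketched below. It rests on an assumption that is not stated in the text: prediction errors are treated as normally distributed around the regression line with a constant standard deviation equal to the SEE of 29 pounds. The function names are hypothetical, and the sketch is not necessarily how Figure 5.7 was produced.

```python
import math

INTERCEPT, SLOPE = 2.031, 0.198   # Equation 1
SEE = 29.0                        # standard error of estimate (lbs)


def normal_cdf(z):
    """Cumulative probability of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def probability_of_meeting(strength, required_force=100.0):
    """Probability that actual push force is at least required_force.

    Assumes normal prediction errors with standard deviation SEE.
    """
    predicted = INTERCEPT + SLOPE * strength
    return normal_cdf((predicted - required_force) / SEE)


for strength in (300, 495, 600, 700):
    print(strength, round(probability_of_meeting(strength), 2))
```

At a strength score of 495 the estimated probability is close to one half, matching the 50 percent figure given above; higher strength scores push the probability toward 1.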

Pass-Fail Model — Often, the criterion of job performance is scaled as a dichotomous variable. For
example, manual lifting tasks are scored pass or fail—the applicant could or could not lift a given
weight load (Jackson, Osburn, Laughery, & Sekula, 1998; Jackson et al., 1992). Other examples are
endurance tasks at a constant power output. A manufacturing work task may require a worker to
repetitively lift and transport weight loads at a given work rate governed by production speed.
Individuals without sufficient physiological capacity would not be able to maintain the set pace. A
task analysis documented that refinery workers must close industrial valves during emergencies (Jackson,
1987; Jackson et al., 1992; Osburn, 1977). For some individuals, the task exceeded their
physiological capacity and they fatigued quickly. For others, the task was within their physiological capacity.
These fit individuals could continue work for extended periods of time. Demanding repetitive tasks
at a set power output tend to produce a bimodal distribution—those who have and those who do
not have the physiological capacity. This is illustrated in the literature (Jackson et al., 1992).

Figure 5.7 Probability of being able to generate 100 pounds of push force for levels of strength
(x-axis: sum of isometric strength, pounds)

Logistic regression analysis (Hosmer & Lemeshow, 1989; Pedhazur, 1997) provides a model
to physiologically validate tests when the criterion is a dichotomous variable. Logistic regression,
like multiple regression, can use a single independent variable or several independent variables. A
logistic regression model estimates the probability of group membership (e.g., criterion variable of
pass or fail) given a score or scores on the predictor variable (Pedhazur, 1997). A landmark public
health application of multiple logistic regression was the Framingham Heart Study
(Kannel, McGee, & Gordon, 1976). The research objective was to identify and quantify
cardiovascular disease risk factors. The logistic analysis not only established that cholesterol, blood pressure,
glucose intolerance, and smoking were independent cardiovascular disease (CVD) risk factors but
also produced an equation for estimating the probability of CVD
risk for combinations of risk factors. Logistic regression analysis, like regression models with
continuous variables, establishes the validity of the independent variable(s) and provides an empirical
model for defining the probability of group membership. The application of simple logistic
regression analysis is illustrated below with a lifting task.

A task analysis of an oil production plant showed that lifting heavy valves from the floor to
knuckle height was an important, physically demanding work task (Jackson, 1998). A
work-sample test was developed to simulate the task. The work-sample test involved lifting several loads that
varied in weight. The physical dimensions of the lift duplicated the work task. The test was scored
pass or fail depending on the subject’s ability to complete the lift. The predictor test was the sum
of four isometric strength tests: arm, shoulder, torso, and leg strength. The goal of this
physiological validation was to define the level of strength required for the lift task.

This validation method is illustrated with three weight loads: 60-, 90-, and 120-pound lifts.
These weights represent industrial lifts ranging from moderately heavy to very difficult. The first
step in this analysis was to determine whether lift success depended on strength. Table 5.4 provides
the means, standard deviations, and sample sizes of the subjects who passed and failed the lift.
Analysis of variance (ANOVA) showed that lift success depended on strength and documented three
expected trends. First, the number of individuals who could lift the load decreased with the weight load.
Next, the ANOVA documented that lift success for all three weights
depended on isometric strength. The means for those who lifted the weight were significantly
higher than for those who could not. Third, the mean strength of those who completed the lift
increased with the weight load. These trends are consistent with physiological expectations.


Table 5.4 Sample sizes, strength means and standard deviations, and analysis of strength differences of
those who could and could not lift the weight

                 Lifted Weight        Did Not Lift Weight      ANOVA
Lift Weight      N      M ± SD        N      M ± SD            F-ratio

60-Pound         120    518 ± 197     16     196 ± 66          41.92*
90-Pound         93     579 ± 175     43     233 ± 101         118.89*
120-Pound        71     644 ± 141     65     301 ± 108         250.69*
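
The link between the group statistics and the F-ratios in Table 5.4 can be sketched in a few lines of Python. This is a minimal illustration, not the authors' analysis: the function name and the dictionary are introduced here for convenience, and the inputs are the rounded table values, so the recomputed F-ratios can differ from the published ones, in some rows noticeably.

```python
def f_ratio_two_groups(n1, m1, sd1, n2, m2, sd2):
    """One-way ANOVA F-ratio for two groups, from N, mean, and SD only."""
    n_total = n1 + n2
    grand_mean = (n1 * m1 + n2 * m2) / n_total
    # Between-groups sum of squares (1 degree of freedom for two groups)
    ss_between = n1 * (m1 - grand_mean) ** 2 + n2 * (m2 - grand_mean) ** 2
    # Within-groups sum of squares, rebuilt from the group SDs
    ss_within = (n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2
    df_between, df_within = 1, n_total - 2
    return (ss_between / df_between) / (ss_within / df_within)


# (N, M, SD) for lifters followed by (N, M, SD) for non-lifters, from Table 5.4
TABLE_5_4 = {
    60: (120, 518, 197, 16, 196, 66),
    90: (93, 579, 175, 43, 233, 101),
    120: (71, 644, 141, 65, 301, 108),
}

for load, stats in TABLE_5_4.items():
    df_within = stats[0] + stats[3] - 2
    print(f"{load}-pound lift: F(1, {df_within}) = {f_ratio_two_groups(*stats):.2f}")
```

A larger F-ratio means the strength difference between lifters and non-lifters is large relative to the spread of strength within each group.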

Figure 5.8 provides a scatter plot of the subjects' strength data contrasted with their 90-pound
lift success. This plot shows the group difference in strength documented by the ANOVA but also
shows an overlap in the strength of those who passed and failed the lift. Logistic regression analysis
provides a model for estimating the probability of success on the criterion variable (i.e., lifting
the load) for given levels on the predictor test (i.e., strength) or, in this example, the probability of
being able to lift the load at a given level of strength. The logistic regression analysis, which agreed
with the ANOVAs (Table 5.4), showed that the regression weight for strength was significantly related
to the probability of lifting the given weight. The equations for the three lift loads are:

60-pound lift

$$\operatorname{Logit}(P) = (0.020 \times \text{Strength}) - 3.926 \tag{2}$$

90-pound lift

$$\operatorname{Logit}(P) = (0.017 \times \text{Strength}) - 5.689 \tag{3}$$

120-pound lift

$$\operatorname{Logit}(P) = (0.023 \times \text{Strength}) - 10.334 \tag{4}$$


[Figure 5.8 plot. Horizontal axis: isometric strength (lbs)]

Figure 5.8 Scatterplot of strength test of subjects who could or could not complete a 90-pound lift from floor
to knuckle height


Once the logistic equation is defined, Equation 5 estimates the probability of success (Pedhauzur,
1997). The term e in Equation 5 is the base of the natural logarithm, a value of approximately 2.718.
Figure 5.9 graphically shows the probability of success in completing the lift across strength levels.

Logistic Probability Calculation Model

$$P = \left( \frac{1}{1 + e^{-\operatorname{Logit}(P)}} \right) \times 100 \tag{5}$$

The logistic probability curves clearly show, as would be physiologically expected, that the
strength needed to lift the load increases as the lift gets heavier. There is a 50 percent probability,
for example, that someone with 200 pounds of strength could lift a 60-pound load. In contrast, only
10 percent of the subjects with 200 pounds of strength would be expected to lift 90 pounds. The
likelihood of someone with 200 pounds of strength lifting 120 pounds is effectively 0. The physiological
levels needed to be 50 percent confident of lifting the 90- and 120-pound loads are about 350 and 450
pounds of strength, respectively.

[Figure 5.9 plot. Horizontal axis: sum of isometric strength (pounds)]

Figure 5.9 Logistic curves of the probability of being able to lift the weight load as a function of lifter strength
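
The probability calculation described above can be sketched in Python using the coefficients of Equations 2 through 4. This is a minimal sketch: the function and variable names are illustrative, and the logistic transform is written in its standard form, P = 100 / (1 + e^(-Logit)), which is assumed here to match Equation 5.

```python
import math

# (slope, intercept) of Logit(P) for each lift load, from Equations 2-4
LIFT_EQUATIONS = {
    60: (0.020, -3.926),
    90: (0.017, -5.689),
    120: (0.023, -10.334),
}


def lift_probability(load, strength):
    """Percent probability of completing the lift at a given strength (lbs)."""
    slope, intercept = LIFT_EQUATIONS[load]
    logit = slope * strength + intercept
    return 100.0 / (1.0 + math.exp(-logit))


def strength_for_probability(load, percent=50.0):
    """Strength (lbs) at which the lift probability equals `percent`."""
    slope, intercept = LIFT_EQUATIONS[load]
    p = percent / 100.0
    target_logit = math.log(p / (1.0 - p))  # 0 when percent is 50
    return (target_logit - intercept) / slope


for load in LIFT_EQUATIONS:
    print(load, round(lift_probability(load, 200), 1),
          round(strength_for_probability(load), 1))
```

Solving for the strength at a chosen probability, as `strength_for_probability` does, is one way to translate a logistic model into a candidate cut-score.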

Physiological  Validation — Matching  the  Worker  to  the  Job 


The goal of physiological test validation is to select workers with the capacity to meet the
demands of the job. This is consistent with ergonomic objectives designed to reduce the risk of
job-related injuries (Ayoub, 1982). As has been shown in this chapter, the statistical models used to
define the physiological stress of the task are less than absolute. This permits latitude in formulating
physical cut-scores ranging from lenient to rigorous. The regression statistics, equations, and
standard errors provide an empirical base for making the decision.

Although the regression models previously discussed can help define the degree of physiological
stress, the difficult task of establishing a suitable cut-score for a criterion remains. The types of
job performance criteria listed in the Uniform Guidelines that may be suitable are supervisory ratings,
production rate, error rate, tardiness, absenteeism, and success in training. According to the
Guidelines, this is not an exhaustive list of criteria. Other examples of criteria used to validate
physical tests include accidents (Reilly et al., 1979); field performance (Reilly et al., 1979); injury rates
(Gilliam & Lund, 2000; Keyserling et al., 1980; Keyserling et al., 1980); lost time due to sickness
or injury (Rayson et al., 2000a; Rayson et al., 2000b); and job-related work tasks (Arnold et al.,
1982; Jackson, Osburn, & Laughery, 1998; Jackson, Osburn, & Laughery, 1984; Jackson et al.,
1992; Jackson, Osburn, & Laughery, 1991; Jackson, Zhang, Laughery, Osburn, & Young, 1993b;
Rayson, 2000a; Rayson, 2000b).




A crucial element of any evaluation strategy is the selection rate of a protected group, which, in
physical testing, is females. The physiological validation method supplements the process of defining
an appropriate cut-score approach with scientific evidence. This validation approach seeks to find the
minimum physiological level demanded by the task. The Uniform Guidelines (EEOC, 1978) allow
the use of rational judgment in setting a valid cut-score. An objective of the physiological validation
process is to provide a scientific explanation of the validation results. Included in this process is the
establishment of a sound cut-score. Receiver operator characteristic (ROC) analysis (Hulley, 1988) is
one method used to establish physiological cut-scores. It supplements the regression results by
defining a cut-score consistent with a strategy of maximizing either test sensitivity or specificity.

A ROC is a graphic analysis used to establish a trade-off between test sensitivity and specificity.
If the goal is to maximize test sensitivity, the proportion of true positives (i.e., those who can
meet the physiological demands of the work), the ROC would be a plot of test sensitivity by
1 - specificity, which is the proportion of false positives. False positives are those identified by
the test as having the physiological capacity to meet the demands of the task but who cannot meet the
demands. In this context, the ROC curve provides a rational method of selecting a cut-score based
on a balance between high sensitivity and a low false-positive rate. The interested reader is directed
to another source (Wellens et al., 1996) for the application of ROC analysis for establishing a
physiological cut-score. The objective of that study was to find the body mass index (ratio of weight
to height squared) that defined the obesity levels of 25 percent and 33 percent body fat content,
determined hydrostatically, for men and women, respectively.
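
The sensitivity and false-positive trade-off behind a ROC analysis can be sketched as follows. This is a minimal illustration under stated assumptions: the strength scores and pass/fail outcomes are made-up values, not data from any study cited here, and picking the cut-score that maximizes sensitivity minus the false-positive rate (Youden's index) is only one possible decision rule, not one the chapter prescribes.

```python
def roc_points(scores, passed, cut_scores):
    """Return (cut, sensitivity, false_positive_rate) for each candidate cut-score.

    A person "tests positive" when their score is at or above the cut.
    """
    n_pos = sum(1 for ok in passed if ok)
    n_neg = len(passed) - n_pos
    points = []
    for cut in cut_scores:
        true_pos = sum(1 for s, ok in zip(scores, passed) if ok and s >= cut)
        false_pos = sum(1 for s, ok in zip(scores, passed) if not ok and s >= cut)
        points.append((cut, true_pos / n_pos, false_pos / n_neg))
    return points


def best_cut_by_youden(points):
    """Cut-score with the largest sensitivity minus false-positive rate."""
    return max(points, key=lambda p: p[1] - p[2])[0]


# Hypothetical strength scores (lbs) and whether each person completed the task
strength = [150, 180, 220, 260, 300, 340, 380, 420, 460, 500]
lifted = [False, False, False, True, False, True, True, True, True, True]

points = roc_points(strength, lifted, cut_scores=[200, 250, 300, 350])
for cut, sens, fpr in points:
    print(f"cut {cut}: sensitivity {sens:.2f}, 1 - specificity {fpr:.2f}")
print("selected cut-score:", best_cut_by_youden(points))
```

A stricter rule, such as requiring a false-positive rate of zero, would move the selected cut-score upward at the cost of screening out more people who could in fact do the work.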

Several factors are considered when establishing physiologically based cut-scores. The following is a
nonexhaustive list of conditions that may determine whether a lenient or rigorous cut-score is selected:

• Adverse Impact: The first concern is adverse impact. Consideration must be given to the
number of protected-group members that the standard screens out.

• Risk of Injury: Subjecting workers to physical demands increases the risk of work-related
injuries. Numerous studies (Cady, Bishoff, O'Connell, Thomas, & Allan, 1979; Gilliam &
Lund, 2000; Herrin, 1986; Keyserling et al., 1980; Liles et al., 1984; Snook, Campanelli, &
Hart, 1978; Snook & Ciriello, 1991) show that the risk of musculoskeletal injury increases
as the demands of the task approach the worker's maximum physiological capacity.

• Physiological Interpretation of the Validation Results: An important element of a physical
test validation study is to establish the congruence among the validation results, published
research, and physiological theory. It is critical to provide a sound physiological explanation
of the validation results. Failure to be able to interpret the results by accepted academic
standards leaves the decision open to question.

• Environmental Conditions: Often, the location at which the validation study is conducted
will be different from the work environment. For example, firefighter tests are not
administered in burning buildings, the source of demanding work. Environmental conditions
(e.g., heat) that increase the demands of the task justify more rigorous standards.

• Workforce Numbers: The number of workers available at the work site can affect the rigor
of a cut-score. A more lenient standard might be considered when several workers are available
to do the work. Although a lenient selection standard would increase the probability
that a worker cannot meet the greatest physical demands of the job (i.e., a false positive), it may
not be a serious problem if others are available to do the work. The stronger workers can
help with the most demanding tasks. In contrast, a more rigorous standard might be considered
if a worker does not have help.

• Criticality of the Job: In some jobs, the failure to meet the demands of a job can be dangerous.
The dummy drag test is a common item of a preemployment firefighter test. This is
a critical task because the inability to perform it successfully can be life threatening.

• Workforce Productivity: Selecting workers with a higher physiological capacity can
increase an organization's productivity. The data in Figure 5.3 show that the amount of
freight a worker was capable of moving was related to the worker's strength capacity. This was
one of the factors considered by a freight company in initiating a preemployment test program.²


Published Validation Studies


Although many preemployment test validation studies have been completed, most are not in the
published literature. The completed validation study often is a technical report to the governmental
agency or private company that funded the project, and many organizations consider these reports
privileged. Hogan (1991b) provides an extensive list of these unpublished reports. The following
sections summarize the published validation research.³


Outside  Craft  Jobs 

One of the first published concurrent validation studies was for outdoor telephone craft jobs
that involved pole-climbing tasks (Bernauer & Bonanno, 1975; Reilly et al., 1979). The issues leading
to the development of this study were the large differences between male and female workers
in turnover and accident rates. After 6 months, 43 percent of the women had left the outdoor craft jobs
compared with only 8 percent of the men. More important, women sustained substantially more
injuries than men from falls while climbing or working on poles.

An extensive job analysis showed that pole climbing was an essential, physically demanding
work task. Bernauer and Bonanno (1975) evaluated the factor composition of 40 tests and
anthropometric measures on a sample of 241 job applicants. They developed a six-item battery consisting
of reaction time, grip strength, percent body fat, step test performance, balance, and sit-ups. They
found that the balance and step tests significantly differentiated successful from unsuccessful
students enrolled in pole-climbing school.

Reilly and associates (Reilly et al., 1979) extended this work by completing two concurrent
validation studies. In the first experiment, several anthropometric and physical performance tests were
administered to 83 male and 45 female candidates for outdoor telephone craft jobs. Two validation
criteria were used in this experiment. The first, general task performance, was the average of two
supervisor ratings of the candidate's performance during the 5-day pole-climbing
school. Job analysis data were used to construct the rating scale. The second criterion was a
dichotomy of those who were on the job 6 months after placement and those who were not. Using
the criterion of general task performance, stepwise multiple regression isolated a three-predictor
battery consisting of dynamic arm strength, reaction time, and Harvard bench step time. The
analysis yielded a multiple correlation of 0.45. The statistically significant zero-order correlations
between the job tenure criterion and these tests were dynamic arm strength, 0.36; reaction time,
0.19; and bench step time, 0.18. Further analysis showed that a common regression line defined
male and female performance, which met the important criterion of job fairness.

The second experiment used a larger sample of employees who represented the whole company.
The criterion of pole-climbing training success was changed to be consistent with changes
introduced in the pole-climbing course. The second study included four different criterion measures
of job performance:

1.  time  to  complete  the  pole-climbing  school, 

2.  completion  of  pole-climbing  school  (a  number  withdrew  from  the  course), 

3.  field  observations  of  pole-climbing  proficiency,  and 

4.  accidents  for  6  months  after  entering  outdoor  craft  work. 

The  second  sample  consisted  of  78  female  and  132  male  pole-climbing  school  applicants. 

Multiple regression selected a three-item battery consisting of body density estimated from
skinfold fat, balance, and an isometric arm strength test. The criterion was time to complete the
course. The significant correlations between the three tests and the four criteria were time to
complete the course, 0.46; training dropout, 0.38; field observations for the female sample, 0.53; and
accidents, 0.15. Further analysis showed that the same regression equation was equally valid for
both males and females.

Firefighters 

Nearly all major fire departments have a physical ability preemployment test (Landy &
Investigator, 1992). Considine and associates (Considine et al., 1976) published the first physical
test battery for screening firefighter applicants. The test battery evolved from an occupational task
analysis that surveyed, rated, and analyzed 81 tasks performed by firefighters. The authors selected
a construct validation strategy. The constructs identified through the task analysis were dynamic
strength, static strength, agility, total body coordination, cardiorespiratory endurance, muscular
endurance, eye-hand coordination, and total body speed.

The sample of the first study consisted of 191 males who were tested on body composition
measures, general physical performance tests, and eight job sample tests. A factor analysis of these
data produced three general factors. The factor names and the tests representing each factor were
factor 1, the ability to handle the body weight, measured by percent body fat, obstacle run, and
flexed-arm hang; factor 2, muscle power, measured by the hose lift, man-lift-and-carry, and stair climb
work sample tests; and factor 3, body structure, measured by fat-free weight and height.

A major purpose of the second study was to analyze the test battery for racial bias. Based on the
results of the first study, nine tests were administered to 165 firefighters and 19 candidates. Data
analysis showed that African-American and white subjects did not differ on any of the tests. These
data were factor analyzed, producing three common factors. The final recommended battery
consisted of four work sample tests and one fitness test, the flexed-arm hang. The work sample tests
were a modified man-lift-and-carry that simulated rescuing a trapped victim; a stair climb that
simulated climbing the stairs in a building; an obstacle run that simulated moving the body through
confined spaces; and a hose couple that involved coupling three hoses to a hose couple.

Davis and associates (Davis, Dotson, & SantaMaria, 1982) examined the relationship between
simulated firefighting tasks and physical performance measures. The sample consisted of 100
randomly selected men from the population of Washington, DC, firefighters. The physical performance
measures included body composition, general fitness, aerobic fitness, and cardiovascular variables. The
five work-sample tests came from the job analysis of firefighter work tasks and involved handling a
ladder, lifting and transporting a 33.1-kilogram load up five flights of stairs, pulling a 23.5-kilogram
hose roll from the ground up to and through the fifth-floor window, carrying and dragging a
53-kilogram dummy down five flights of stairs, and using a sledge hammer to simulate forceful entry.

Canonical correlation showed that two independent dimensions defined the relationship
between the physical performance variables and firefighter work-sample tests. The first canonical
dimension (Rc = 0.79) represented a physical work capacity factor that reflected the muscular
strength and endurance and maximal aerobic capacity elements of the simulated work-sample
tests. The second dimension (Rc = 0.63) represented a resistance to fatigue factor and the ability to
complete the work tasks quickly. Multiple regression selected two physical performance batteries
(laboratory and field batteries) to estimate each work-sample dimension. The field test battery for
the physical work capacity factor consisted of push-ups, sit-ups, and grip strength. The validity of
the field battery (R = 0.73) was lower than that of the five-item laboratory battery (R = 0.95), which
added submaximal oxygen pulse and maximum heart rate to the battery. The three-item field test of the
second factor included estimated percent body fat, lean body weight, and VO2max estimated with
a step test (R = 0.77). The laboratory test added maximum heart rate and treadmill performance
and increased the validity (R = 0.89) for the resistance to fatigue work sample factor.

The physiological response to firefighting has been the focus of many investigators. Exercise
heart rate responses elicited by simulated and actual firefighting tasks confirmed that these tasks
have a significant cardiovascular effect (Barnard & Duncan, 1975; Davis & Convertino, 1975;
Lemon & Hermiston, 1977; Manning & Griggs, 1983; O'Connell, Thomas, Caddy, & Karwasky,
1986; Sothmann, Saupe, Jasenor, & Blaney, 1992). In a study during actual fire-suppression
emergencies, Sothmann and associates (Sothmann et al., 1992) measured exercise heart rate and oxygen
uptake on 10 male firefighters. Their data showed that firefighters worked at an average of 88
percent (±6%) of their measured maximum heart rate for an average duration of 15 (±7) minutes. The
average energy cost of the firefighter emergency work task was a VO2 of 25.6 ± 8.7 ml/kg/min,
representing an intensity of 63 percent (±14%) of VO2max.

Sothmann and associates (Sothmann et al., 1990) examined the relationship between VO2max
and firefighting work tasks. A seven-item, content-valid fire suppression test was administered to
20 experienced firefighters. The average energy cost of the firefighter simulation tests was 30.5
(±5.6) ml/kg/min. The work simulation required the firefighters to work at an intensity of 76 percent
(±8%) of VO2max. The correlation between the elapsed time required to complete the firefighter
work simulation test and measured VO2max was -0.55. In a cross-validation study with 32 different
male firefighters, successful work simulation performance depended on VO2max. Of the 32
tested, seven firefighters could not complete the work sample tests. The VO2max of five of the
seven was below 33.5 ml/kg/min.


Highway  Patrol  Officers 

With an increasing number of women seeking employment as highway patrol officers, the
objective of the study published by Wilmore and Davis (1979) was to find the minimum physical
qualifications and develop a job-related preemployment test. They administered three different
batteries of tests to 140 male and 16 female patrol officers. The laboratory and field test batteries
included strength, flexibility, body composition, and cardiorespiratory endurance items. The job
sample tests included a barrier surmount and arrest simulation and a dummy drag that simulated
dragging an injured victim 50 feet to safety.

The major differences between the field and laboratory batteries were that the 1.5-mile run
replaced the maximum treadmill test, and body fat was estimated from skinfolds rather than
measured by hydrostatic weighing. The laboratory test battery was significantly correlated with the
dummy drag (R = 0.66) and barrier surmount and arrest simulation tests (R = 0.68). Replacing the
laboratory tests with the field tests resulted in slightly lower correlations: 0.57 for the dummy drag
and 0.62 for the barrier surmount and arrest simulation tests. Although the fitness tests estimated
work simulation test performance, test performance was not related to job performance, which was
measured by supervisor ratings on 16 critical job tasks.

The data analysis showed that the officers were similar to the normal population in strength, body
fat, flexibility, and cardiorespiratory endurance. An important result of the study was that the
predominantly sedentary nature of the officer's job led to a rapid deterioration in physical fitness following
his or her academic training, suggesting the need for an in-service physical conditioning program.


Steel  Workers 

Arnold and associates (Arnold et al., 1982) developed a preemployment test for selecting
entry-level steel workers. The task analysis documented that entry-level steel workers must do several
different physically demanding tasks. The investigators used a combination of content- and
construct-validation strategies. The job analysis identified the physically demanding work tasks required of
the entry-level workers and categorized them by Fleishman's constructs of static strength, dynamic
strength, and endurance (Fleishman, 1964). The selected candidate physical performance tests
were those that theoretically measured these constructs.

The objective of the study was to determine whether the physical performance tests were related to
the work-sample tests developed from the job analysis. The sample included 168 men and 81 women
who were in their first 6 months of employment at three different plant locations. The job analysis
showed that work tasks differed somewhat across the 3 sites, resulting in 11 work sample tests at 1 site
and 12 at the other 2 sites. The average work-sample test performance was the criterion of work
performance. In addition to the work-sample tests, each subject completed 10 physical performance tests
sampling strength, flexibility, agility, balance, and cardiorespiratory endurance dimensions.

Multiple regression selected the physical performance tests most highly correlated with the
work-sample criterion. For all three work sites, arm dynamometer strength was the most important
predictor of work-sample test performance. The zero-order correlations between arm strength and
work-sample test performance were consistently high: 0.82, 0.85, and 0.85 for the three sites.
Adding two more tests to the multiple regression models added little to the validity; the multiple
correlations for the three-predictor models increased to 0.87, 0.88, and 0.89.

The authors completed a utility analysis for the single arm strength test (Hunter, Schmidt, &
Hunter, 1979). This analysis involved estimating the money the company would save by hiring
workers who could do the work. Utility estimates were based on test validity, and the monetary
value was related to the variability of work performance. Using 1982 wage standards, Arnold and
associates estimated that using the single arm strength test to select employees would lead to a
savings of about $5,000 per year for each employee selected. Based on employees hired, the estimated
company savings were more than $9 million a year.
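
A utility estimate of this kind can be illustrated with the Brogden-Cronbach-Gleser model that is widely used in selection utility analysis. Whether Arnold and associates applied exactly this form is an assumption made here, and the validity, dollar standard deviation, and selection ratio in the sketch are hypothetical inputs, not values reported in the study. Testing costs are ignored.

```python
from statistics import NormalDist


def utility_per_hire(validity, sd_dollars, selection_ratio):
    """Expected yearly dollar gain per selected employee.

    Assumes top-down selection on a normally distributed test score, so the
    mean standard score of those selected is pdf(z_cut) / selection_ratio.
    """
    standard_normal = NormalDist()
    z_cut = standard_normal.inv_cdf(1.0 - selection_ratio)
    mean_z_selected = standard_normal.pdf(z_cut) / selection_ratio
    return validity * sd_dollars * mean_z_selected


# Hypothetical inputs: validity 0.84, performance SD of $8,000, half of applicants hired
gain = utility_per_hire(validity=0.84, sd_dollars=8000, selection_ratio=0.5)
print(f"estimated gain per hire: ${gain:,.0f} per year")

# Taking the two reported figures at face value, about $5,000 per hire and
# more than $9 million in total imply on the order of 1,800 employees hired.
implied_hires = 9_000_000 / 5_000
print(f"implied number of hires: {implied_hires:,.0f}")
```

The sketch makes the two dependencies named in the text explicit: the gain scales with test validity and with the dollar variability of work performance.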

Underground Coal Mining

A job analysis showed that the work of underground coal miners was physically demanding and
that the work could be represented with four work sample tests (Jackson & Osburn, 1983; Jackson
et al., 1991). The first work-sample simulation test, roof bolting, measured maximum isokinetic
torque and simulated straightening a steel roof bolt. The block carry test involved lifting,
transporting, and placing 82-pound concrete blocks in positions commonly used to build retaining walls
in the mine. The shoveling simulation test involved shoveling polyvinyl chloride from the floor over
a 3.5-foot wall. Polyvinyl chloride has the same density as coal, and the task was to shovel 800
pounds at a rate consistent with the subject's fitness. The bag carry simulation test measured the
number of 50-pound bags that were lifted and transported 9 feet during a 5-minute period.

The four work-sample tests and three isometric strength tests (grip, arm lift, and torso lift)
(NIOSH, 1977) were administered to 25 male and 25 female subjects. The validation strategy was
similar to that followed by Arnold and associates with steelworkers (Arnold et al., 1982). The
correlations between the sum of the isometric strength tests and four work-sample tests ranged from
0.68 for the bag carry test to 0.91 for the roof bolting test. Multiple regression analysis showed that
neither gender nor the gender-by-isometric strength interaction accounted for additional significant
variance. This showed that a common male and female regression line defined the relationship
between strength and work-sample test performance.
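
The test for a common regression line described above can be sketched as a comparison of two nested models: one that predicts the work-sample score from strength alone, and one that adds gender and the gender-by-strength interaction. The code below is a minimal pure-Python sketch; the function names and the ten data points are hypothetical and are not taken from the coal mining study.

```python
def sse_of_fit(columns, y):
    """Sum of squared errors of a least-squares fit of y on the given
    predictor columns plus an intercept, solved via the normal equations."""
    rows = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(rows[0])
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    c = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    # Gaussian elimination with partial pivoting
    for i in range(k):
        pivot = max(range(i, k), key=lambda row: abs(a[row][i]))
        a[i], a[pivot] = a[pivot], a[i]
        c[i], c[pivot] = c[pivot], c[i]
        for row in range(i + 1, k):
            m = a[row][i] / a[i][i]
            for j in range(i, k):
                a[row][j] -= m * a[i][j]
            c[row] -= m * c[i]
    # Back substitution for the regression weights
    b = [0.0] * k
    for i in reversed(range(k)):
        b[i] = (c[i] - sum(a[i][j] * b[j] for j in range(i + 1, k))) / a[i][i]
    return sum((yi - sum(bj * xj for bj, xj in zip(b, r))) ** 2
               for r, yi in zip(rows, y))


def common_line_f(strength, gender, score):
    """F-ratio for the variance added by gender and gender-by-strength."""
    interaction = [s * g for s, g in zip(strength, gender)]
    sse_reduced = sse_of_fit([strength], score)
    sse_full = sse_of_fit([strength, gender, interaction], score)
    df_num, df_den = 2, len(score) - 4
    f_ratio = ((sse_reduced - sse_full) / df_num) / (sse_full / df_den)
    return f_ratio, df_num, df_den


# Hypothetical data: 5 men (gender 0) and 5 women (gender 1) on one common line
strength = [400, 450, 500, 550, 600, 200, 250, 300, 350, 400]
gender = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
noise = [0.5, -0.3, 0.2, -0.4, 0.1, -0.2, 0.4, -0.1, 0.3, -0.5]
score = [0.1 * s + e for s, e in zip(strength, noise)]

f_ratio, df_num, df_den = common_line_f(strength, gender, score)
print(f"F({df_num}, {df_den}) = {f_ratio:.2f}")
```

A small, nonsignificant F-ratio is the pattern the text reports: gender adds nothing once strength is known, so one regression line serves both groups.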

Both exercise heart rate and rating of perceived exertion data showed that the shoveling and bag
carry tests had significant aerobic components (Jackson et al., 1991). In addition to the isometric
strength tests, the subject's maximal arm cranking oxygen uptake was metabolically determined. The
zero-order correlations between the sum of isometric strength and the work-sample shoveling and
bag carry tests were higher than the correlations found with arm VO2max (ml/min). The strength
correlations were 0.71 for shoveling and 0.63 for the bag carry test, compared with 0.68 and 0.46 for
arm VO2max (ml/min). Multiple regression analysis showed that arm VO2max accounted for an
additional 9 percent of shoveling variance beyond that of isometric strength but did not account for
additional bag carry variance. Polynomial regression analysis showed that the relationship between
these two endurance work-sample tests and isometric strength was quadratic, not linear. Strength
was more important for differentiating among work sample performance at the lowest strength levels.


Chemical Plant Workers


Job analyses documented that the physically demanding tasks required of chemical and refining
plant workers included cracking, opening, and closing valves (Jackson, Osburn, Laughery, &
Vaubel, 1990; Osburn, 1977). Osburn (1977) developed a valve-turning work-simulation test
administered on a specially developed ergometer consisting of a disc brake mechanism turned by a
12-inch valve handwheel. The unit was calibrated to a power output of 1,413.5 foot-pounds/minute.
The objective of the work-sample test was to complete 250 revolutions in 15 minutes. The job
analysis showed this level of work would open or close 75 percent of the emergency valves in 15 minutes.

The distribution of the valve-turning test was bimodal. Physically fit workers easily completed
the 15-minute test, but the test was too demanding for many, who stopped before reaching 50
revolutions (Jackson et al., 1990). The test elicited maximal cardiovascular responses in many
applicants (Osburn, 1977). This result led to a second study designed to determine whether isometric
strength tests validly predicted valve-turning performance (Jackson, 1987; Jackson et al., 1992).
The valve-turning work-sample test and three isometric strength tests (grip, arm lift, and torso lift)
were administered to 26 men and 25 women. The zero-order correlation between the tests was
0.82. Because of the bimodal shape of the valve-turning distribution, a logistic regression model
(Pedhauzur, 1997) defined the probability of completing the test by levels of isometric strength.
The logistic equations and probability curves are published (Jackson et al., 1992).

In a second study, a task analysis questionnaire completed by operators at a major chemical
plant identified valve cracking as the most physically demanding work task (Jackson et al., 1990).
An electronic load cell measured the peak cracking torque on 217 randomly selected valves in the
plant. The sampled valves included those with horizontal and vertical orientations, those positioned
close to the ground and overhead, those in awkward or hard-to-reach positions, and valves of various
sizes. The results of this biomechanical job analysis showed that 100 pounds of force applied to the
end of a 36-inch valve wrench generated sufficient torque to crack 93 percent of the plant valves.
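
The criterion can be restated as a torque. The Python sketch below does the arithmetic, assuming
the force acts perpendicular to the wrench at its end, and shows how the share of sampled valves
within reach of that torque could be counted. The list of cracking torques is hypothetical and is
not the plant data.

```python
def torque_ft_lb(force_lb, lever_arm_in):
    """Torque in foot-pounds for a force applied perpendicular to the wrench."""
    return force_lb * lever_arm_in / 12.0  # 12 inches per foot

def fraction_cracked(cracking_torques, available_torque):
    """Share of sampled valves whose peak cracking torque is within reach."""
    cracked = sum(1 for t in cracking_torques if t <= available_torque)
    return cracked / len(cracking_torques)

# 100 pounds at the end of a 36-inch wrench (values from the job analysis)
criterion = torque_ft_lb(100, 36)
print(criterion)  # 300.0 foot-pounds

# Hypothetical peak cracking torques (foot-pounds) for four valves
sample = [120.0, 180.0, 250.0, 310.0]
print(fraction_cracked(sample, criterion))  # 0.75
```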

A valve-cracking work-sample test simulated cracking valves in eight different ways. The eight
cracking torques were obtained by varying the action (push and pull), direction (horizontal and
vertical), and height (high and low). A computerized torque wrench measured the torque applied to
four nuts placed in vertical and horizontal positions at two heights.

The valve-cracking test and isometric strength tests (grip, arm lift, and torso lift) were
administered to 118 men and 66 women. The intercorrelations among the eight measures of
valve-cracking torque were high, ranging from 0.66 to 0.89. Because of the high intercorrelations, the eight
valve-cracking scores were averaged and used as the work-sample measure. The correlation between
the sum of the three isometric strength tests and average valve-cracking torque was 0.65. A
logistic regression equation (Pedhauzur, 1997) defined a probability model for estimating the chances
of generating the 100-pound criterion for levels of isometric strength. These data are published
elsewhere (Jackson et al., 1992).


Electrical Transmission Lineworkers


Doolittle and associates (Doolittle et al., 1988) developed a preemployment test for selecting
electrical transmission lineworkers. The study included an extensive job analysis of electrical
transmission lineworker jobs. The initial stage of the task analysis surveyed workers using scales
designed to answer three questions:

1.  How  often  was  each  task  performed? 

2.  How  much  time  was  spent  completing  each  task? 

3.  How  physically  demanding  was  each  task  for  the  individual? 

The identified critical, physically demanding tasks were studied in detail to define the forces
needed to perform them safely and efficiently. This involved defining standard anatomical
movements for lifting, pushing, and hoisting; measuring the masses lifted and forces exerted; and
estimating the metabolic costs of various work tasks.

Using the task analysis data, 5 strength tests that duplicated the muscular actions were selected
and administered to 48 incumbents. The tests required the subject to move a weight that
represented loads that linemen moved. The weights ranged from 7 to 61 kilograms. The final two tests
selected were chin-ups and VO2max estimated from bench stepping and exercise heart rate. The seven tests
were combined into a single performance measure. Criterion-related validity was examined by
comparing physical test performance with two criteria, supervisor ratings and accident rates. The crew
chiefs confidentially evaluated each incumbent on the following six dimensions of job performance:

1.  productivity, 

2.  working  with  others, 

3.  supervision, 

4.  safety, 

5.  physical  ability,  and 

6.  technical  skills. 

The correlations between the composite physical test and the two criteria, supervisor ratings and
lost work days because of on-the-job injuries averaged over 5 years, were 0.59 and 0.46, respectively.


Diver  Training 

Two validation studies (Gunderson, Rahe, & Arthur, 1972; Hogan, 1985) were designed to
estimate successful completion of Military underwater diver training programs. Gunderson and
associates (Gunderson et al., 1972) used successful completion of underwater demolition training as the
criterion of performance. They found a multiple correlation of 0.54 between success defined by the
completion of training and five variables: squat-jumps, pull-ups, sit-ups, body weight, and the Cornell
Medical Index. Using these tests, they predicted about 70 percent of those who passed training.

Hogan (Hogan, 1985) studied 46 male naval personnel who volunteered for diver training. The
first criterion of success comprised nine performance rating scales that reflected physical condition,
swimming training, leadership potential, teamwork, and overall performance. The second criterion
was successful completion of training. The predictor measures included 3 anthropometric
measurements and 23 fitness tests. Hogan reported a multiple correlation of 0.63 between the average
performance rating and three physical tests: 1-mile run, sit and reach, and muscular endurance
measured with an arm ergometer. The multiple correlation between these three tests and successful
completion of the course was 0.64. Hogan suggested that the validity coefficients were likely an
overestimate because of an unfavorable ratio of the number of variables to subjects (Pedhauzur, 1997).

Demanding  Military  Jobs 

The U.S. Military Services examined methods of matching enlisted personnel with physically
demanding jobs. The U.S. Air Force adopted a pre-induction dynamic one-repetition maximum
(1-RM) strength test (Ayoub et al., 1982). The U.S. Army and U.S. Navy examined the
relationship between body composition variables and physically demanding work tasks (Marriott &
Grumstrup-Scott, 1992).

The U.S. Air Force developed a Strength Aptitude Test (SAT) to match the general strength
abilities of individuals with the specific strength requirements of U.S. Air Force jobs filled by
enlisted personnel (Ayoub et al., 1982). The U.S. Air Force SAT measures the subject’s voluntary 1-RM
lift to a height of 6 feet. The SAT starts with a 40-pound lift. The lift load is increased by 10
pounds until the subject reaches his or her maximum voluntary lift or a maximum weight of 200
pounds. The SAT is administered to U.S. Air Force recruits as part of their pre-induction physical
examination. Each enlisted U.S. Air Force career field has a prerequisite SAT cut-score.

An area of concern expressed by the Committee on Military Nutrition Research of the Institute
of Medicine, National Academy of Sciences, is the role body composition plays in physical
performance. This relationship is important not only for making decisions about acceptance or
rejection of recruits for the Military Service but also for retention and advancement while in the Service
(Marriott & Grumstrup-Scott, 1992). Hodgdon and associates (Hodgdon, 1992) examined the
relationship between body composition, fitness, and materials-handling tasks required of naval
enlisted men. The two materials-handling tasks were the maximum box weight that could be
lifted to elbow height and the total distance a 34-kilogram box could be carried during two 5-minute
workouts. The variables most highly correlated with maximum box lift were push-ups (r = 0.63)
and fat-free mass (r = 0.80). The variables most highly correlated with the box carry test were
push-ups (r = 0.56), 1.5-mile run time (r = -0.67), and fat-free mass (r = 0.44). Fat-free mass was
highly correlated with muscular strength measures, suggesting the possibility of using fat-free mass as
an approximation of general strength in job assignment.

Vogel and Friedl (Vogel & Friedl, 1992) examined the relationship between body composition
and absolute lifting capacity. They reported significant correlations between maximum lifting
capacity and fat-free mass for male and female soldiers. Although they did not test for
homogeneity of male and female regression lines, they published separate equations for men and women.

A limitation of Military testing programs is the lack of job-related materials-handling
performance tests. While recognizing the need to develop content-valid tests, the Committee on
Military Nutrition Research concluded that there was a direct relationship between Military
materials-handling tasks and fat-free mass. In view of this relationship and the lack of job-related tests,
the Committee recommended that the Military seriously consider establishing a minimum standard for
fat-free mass (Marriott & Grumstrup-Scott, 1992). Such a recommendation might be implemented for the
Military, but using body composition variables in pre-employment tests in the private sector would
likely meet an immediate legal challenge.

Rayson and associates (Rayson et al., 2000a; Rayson et al., 2000b) completed a major
criterion-related validation study for the British army. They examined the effectiveness of the British army’s
Physical Standards for Recruits (PSS(R)) in predicting criteria measuring recruit success in basic
training. The PSS(R) consisted of tests measuring body mass, body composition, strength, and
endurance. The criteria included:

1.  four  representative  Military  tasks  (RMT)  consisting  of  a  single  lift,  carry,  repetitive  lift, 
and  loaded  march, 

2.  the  days  lost  to  injury  and  sickness  during  basic  training, 

3.  degree  of  success  of  basic  training,  and 

4.  job  performance  ratings  by  self,  peer,  and  supervisor. 

The PSS(R) tests were administered to more than 1,000 recruits (770 males and 239 females) prior to
starting basic training, and the army job performance criteria were obtained at the end of basic training.

The PSS(R) tests correctly predicted outcomes on the RMTs for 74.9 percent of the recruits,
of which 58.7 percent were true positives and 16.2 percent were true negatives. Of the 25.1 percent
misclassified, 15.5 percent were false positives and 9.6 percent were false negatives. The false
negatives were those recruits predicted by the PSS(R) tests to fail the four RMTs when they did pass
the tasks. Although data were not presented, the authors indicated that most of the female
misclassifications were false positives, “...women being incorrectly accepted rather than incorrectly
rejected from the army.” A significant relationship was found between training outcome and
passing the PSS(R) tests. Additionally, the PSS(R) tests were significantly related to days lost because
of injury and sickness during basic training. Those recruits who failed their selection outcome lost
a median of 2 days compared with no days for the recruits who passed. Although the difference was
not statistically significant, the performance ratings of those who failed the selection tests were
consistently lower than those of recruits who passed the tests. The authors concluded that the PSS(R)
were valid, useful predictors of British army performance.
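
The four reported percentages form a two-by-two classification table, and some standard summary
indices can be derived from them. The Python sketch below does this; the indices themselves
(sensitivity, specificity, positive predictive value) are not reported in the text above and are
shown only as an illustration of how such a table can be read. The function name is an assumption.

```python
def classification_summary(true_pos, true_neg, false_pos, false_neg):
    """Summary indices from the four cells of a 2 x 2 table (percent of recruits)."""
    total = true_pos + true_neg + false_pos + false_neg
    return {
        # share of all recruits classified correctly
        "accuracy": (true_pos + true_neg) / total,
        # share of recruits who passed the RMTs that the tests predicted to pass
        "sensitivity": true_pos / (true_pos + false_neg),
        # share of recruits who failed the RMTs that the tests predicted to fail
        "specificity": true_neg / (true_neg + false_pos),
        # share of predicted passes who actually passed the RMTs
        "positive_predictive_value": true_pos / (true_pos + false_pos),
    }

# Percentages reported for the PSS(R) tests against the RMT outcomes
summary = classification_summary(true_pos=58.7, true_neg=16.2,
                                 false_pos=15.5, false_neg=9.6)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```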

Manual  Lifting  Tasks 

Manual lifting tasks are common elements of many jobs and have been studied extensively.
The reason for this attention is the large number of jobs that include materials-handling
tasks and the injury risk associated with lifting. It is estimated that about 50 percent of
all industrial back injuries are caused by lifting, and about 67 percent of the injuries are caused by
lifting loads that are too difficult for industrial workers (Snook et al., 1978).

An established ergonomic injury-reduction strategy is to match the worker with the demands
of the lifting task. One major approach is to engineer the stress out of the task. This approach
defines the lift weights that are within the physiological capacity of most industrial workers
(Ayoub, 1982). The first research-based strategy used psychophysical methods to define the lift
weight perceived as acceptable to 75 percent of industrial workers. Snook and associates (Snook &
Ciriello, 1974; Snook & Ciriello, 1991; Snook, Irvine, & Bass, 1970) published separate standards
for males and females. The maximum acceptable lift weight for females was about 50 percent of the
lift weights for males. A newer strategy is the use of the NIOSH multiplicative equations (NIOSH,
1981; Waters et al., 1993) that consider several different lift difficulty parameters. The NIOSH
equations extend the psychophysical methodology of Snook and associates by also using
biomechanical and physiological criteria to define the recommended weight of lift (RWL). The newest NIOSH
equation (Waters et al., 1993) defines a RWL that would be acceptable to 75 percent of the female
industrial population. Using the 75th percentile female as the RWL criterion produces a
conservative estimate. The RWL for the common floor-to-knuckle lift at a frequency of one lift every 30
minutes, for example, is only 10 kilograms or 22 pounds (Waters et al., 1993).
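
To show what "multiplicative" means here, the Python sketch below multiplies a load constant by
a set of task multipliers. The multiplier forms are the commonly cited metric versions of the
revised equation and should be checked against Waters et al. (1993) before any use; the frequency
and coupling multipliers come from published tables and are simply passed in as numbers. The
function and parameter names are assumptions made for this illustration.

```python
LOAD_CONSTANT_KG = 23.0  # load constant commonly cited for the revised equation

def recommended_weight_of_lift(h_cm, v_cm, d_cm, a_deg, fm=1.0, cm=1.0):
    """RWL (kg) as the product of a load constant and task multipliers.

    h_cm: horizontal distance of the hands from the ankles
    v_cm: vertical height of the hands at the start of the lift
    d_cm: vertical travel distance of the lift
    a_deg: asymmetry (twisting) angle in degrees
    fm, cm: frequency and coupling multipliers taken from published tables
    """
    hm = 25.0 / max(h_cm, 25.0)           # horizontal multiplier
    vm = 1.0 - 0.003 * abs(v_cm - 75.0)   # vertical multiplier
    dm = 0.82 + 4.5 / max(d_cm, 25.0)     # distance multiplier
    am = 1.0 - 0.0032 * a_deg             # asymmetry multiplier
    return LOAD_CONSTANT_KG * hm * vm * dm * am * fm * cm

# Under ideal conditions every multiplier equals 1 and the RWL is the load constant.
print(recommended_weight_of_lift(h_cm=25, v_cm=75, d_cm=25, a_deg=0))

# Moving the load farther from the body lowers the recommended weight.
print(recommended_weight_of_lift(h_cm=50, v_cm=75, d_cm=25, a_deg=0))
```

Because every multiplier is at most 1, each departure from the ideal lift can only lower the
recommended weight, which is one reason the resulting limits are conservative.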

The NIOSH equation focuses on job design, i.e., defining a RWL for most male (99 percent)
and female (75 percent) industrial workers for all ages in the workforce. A limitation of the NIOSH
equation is that it does not consider individual differences in physiological capacity of workers.
Many common materials-handling tasks exceed the NIOSH equation’s RWL estimates. The
second ergonomic method of matching the worker with the demands of the job is to select individuals
with the physiological capacity to do the job with a margin of safety (Ayoub, 1982; Keyserling et
al., 1980; NIOSH, 1977).

The content-validation method is often used to validate materials-handling tests. A
content-valid test would be to have the applicant perform the task, e.g., lift a 90-pound jackhammer and
transport it a specified distance. Although this type of test would be content valid, it has two
limitations. First, it is not possible to determine one’s maximum capacity. Second, motivated applicants
without the physiological capacity demanded by the task place themselves at risk of injury (Ayoub,
1982). One of the first ergonomic approaches used to overcome these limitations was to use
isometric strength tests that duplicated the position assumed by the worker to do the lift. These
position-specific strength data were used to determine whether an applicant had sufficient strength
capacity to do the work with a margin of safety (Keyserling et al., 1980; Keyserling et al., 1980).

Gilliam and Lund (2000) examined the effects on work-related injuries of physiologically
matching workers to the demands of the job. Isokinetic strength was measured on 365 applicants
for truck driver and dockworker jobs. The isokinetic data were used to generate a Department of
Labor Dictionary of Occupational Titles strength rating. This rating was used to select applicants
who matched the physical demands of the job. Of the 365 applicants, 276 matched the job demands
and were hired. The 89 applicants who did not match were not hired. Those hired were
significantly stronger than those who were not hired. In addition, those not hired were significantly
heavier than those hired, by an average of 44 pounds. The injury rates of the
strength-matched new hires were compared with historical data on workers matched for
employment duration. The overexertion injury rates to the knees, shoulders, and back were 1.04 for the
strength-matched workers compared with 16.7 for the non-matched workers, suggesting that
preemployment screening is effective in reducing injury. Although not examined, these results also
suggest that body composition may have been a factor. A strength-weight profile of weaker and
heavier versus stronger and lighter suggests a difference in percent body fat. The stronger-lighter
profile is consistent with a lower percent body fat, which also might have been an injury risk factor.

Another physiological approach to matching the worker to the demands of the job is to use
standard strength tests to assess an individual’s physiological capacity and use regression models to define
the probability of being able to complete a lift (Jackson & Sekula, 1999; Jackson, Borg, Zhang,
Laughery, & Chen, 1997). This approach was used to study hospital workers involved with lifting
and transporting patients. An analysis of hospital jobs documented that patient lifting was a
demanding lift task (Jackson, Osburn, Laughery, Young, & Zhang, 1994). Patient lift tasks are a
major source of injury to the lifter (Garg & Owen, 1992). The lift dimensions of the most common
single-person patient lift were used to devise a work-sample lift test. The most common patient lift
task is lifting a patient who is sitting in a chair. The simulated lift test consisted of lifting a box from
a height of 53 cm to a height of 48 cm. The hand position at the start of the lift was at the height at
which the lifter would grab a patient sitting in a chair. The lift task consisted of lifting seven loads ranging
in weight from 15 to 90 pounds. The subjects lifted those loads that were within their capacity and
rated lift difficulty with Borg’s CR-10 psychophysical scale (Borg, 1982; Borg, 1998). Logistic
regression analysis of the data on 58 female and 33 male subjects showed that the capacity to
complete a lift depended on the lifter’s physiological capacity sampled by his or her isometric strength
and fat-free mass. Further analyses showed that the subject’s CR-10 rating of each lift was
significantly correlated with isometric arm, shoulder, torso, and leg strength, and fat-free weight.

The results of the patient lift study suggested that lift weight and the physiological capacity of
the lifter could be used to develop a generalized lift model. The second study examined the role of
lift load, strength, and gender on psychophysical lift capacity (Jackson, 1999). A floor-to-knuckle
lift test was administered to 209 men and 181 women. The task involved lifting loads ranging from
22 to 143 pounds. The subject started with a light lift load and continued to lift heavier loads until
either the heaviest load was lifted or the subject failed the lift. The load increased in increments of
11 pounds. After each completed lift, the subject rated the lift difficulty with Borg’s CR-10 scale
(Borg, 1998). The subject’s physiological strength capacity was measured with basic isometric
strength tests (Baumgartner & Jackson, 1999). Each subject’s dynamic lift profile was defined with
a power function regression equation using the completed lift weight as the independent variable
and the CR-10 rating as the dependent variable. Using the power function regression equation, one
lift weight and the associated CR-10 rating were randomly selected for each subject. This created
a distribution of lift weights and associated psychophysical ratings ranging from very easy to the
maximum within the subject’s psychophysical capacity. Multiple regression provided an equation
to estimate psychophysical lift difficulty from lift load, strength, and the
gender-by-weight load interaction. The multiple correlation for the model was 0.81, with a standard error
of 1.7 CR-10 units. The derived equation provided a model that defined the psychophysical lift
demands of common industrial weight loads for individuals who differed in physiological capacity.
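
A power function of the form rating = a × weight^b can be fitted by ordinary least squares after
taking logarithms of both variables. The Python sketch below shows that approach for one subject.
The fitting method is an assumption (the text does not say how the power functions were estimated),
and the ratings are synthetic values generated from a known curve, not study data.

```python
import math

def fit_power_function(weights, ratings):
    """Fit rating = a * weight**b by least squares on log-transformed data."""
    xs = [math.log(w) for w in weights]
    ys = [math.log(r) for r in ratings]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                         # exponent of the power function
    a = math.exp(mean_y - b * mean_x)     # scale constant
    return a, b

# Lift loads from 22 to 143 pounds in 11-pound steps, endpoints included
loads = list(range(22, 144, 11))

# Synthetic subject who completed the six lightest loads; the ratings follow a
# known power curve so that the fit can be checked against it.
completed = loads[:6]
ratings = [0.002 * w ** 1.6 for w in completed]

a, b = fit_power_function(completed, ratings)
print(f"a = {a:.4f}, b = {b:.2f}")
```

Once a and b are known for a subject, the fitted curve gives a rating for any load within the
range that subject completed.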

The psychophysical modeling of industrial lift tasks provides evidence not only about an
individual’s probability of being able to complete a lift but also about the psychophysical stress of the
lift. The psychophysical demand of a lift task is related to the risk of back injury (Herrin, 1986; Liles, 1984;
Snook et al., 1978). Lifting loads psychophysically judged to be difficult increases the risk of injury.
Psychophysical ratings provide an index of relative demand for the individual. Resnik (1995)
presents preliminary data showing that Borg’s psychophysical rating can be interpreted by the
physiologically significant scale of percentage of maximum capacity. With a sample of 254 male and 354
female subjects, a correlation of 0.91 was obtained between Borg’s CR-10 rating and the subject’s
maximum functional lift capacity (Sekula, Jackson, & Laughlin, under review). Maximum
functional lift capacity was the subject’s percentage of maximum lift, where maximum lift represented the
weight load equal to the subject’s Borg psychophysical CR-10 rating of 10. A regression equation
was developed to convert CR-10 ratings into the metric of percentage of max. The standard error
of estimate for the linear equation was 8.5 percent max. This research could provide researchers
with the capacity to interpret psychophysically defined lift loads with the well-established
physiological intensity metric of percentage of maximum capacity.


Summary 


In summary, the Uniform Guidelines require validity studies to be carried out whenever there is
a need to continue selection practices that lead to adverse impacts. Three types of validity studies are
recognized: content-validity, criterion-related validity, and construct-validity studies. The guidelines
require all validity studies to be carried out in a responsible, scientifically sound manner, and call for
the use of good judgment in the implementation of selection procedures. The EEOC is waiting for
developments in the field before it completely endorses construct-validity studies. A major
difference in physical test validation is the use of physiological rather than psychological tests. The goal of
physiological validation is to define the physiological capacity needed by a worker to perform the
work demanded by the task. Principal features of the physiological validation approach are the use
of a physiological metric to quantify test performance and the interpretation of validity results using
relevant physiological research and theory. These data are used to develop physiologically sound
cut-scores. Although numerous physical test validation studies have been completed, most are not
published. The results of those published show that physical tests can be used to select workers with the
physiological capacity to do demanding jobs. Ergonomic research shows that selecting workers with
the physiological capacity to do the work reduces the risk of work-related injuries.


Endnotes 


1. The probability can be estimated with the following equation:

   $$z = \frac{Y' - \text{criterion}}{\text{standard error of estimate}}$$

where Y’ is the estimated criterion score and the criterion is the desired value, in this example, 100.

Once the z-score is obtained, a table of normal curves can be used to estimate the proportion of
subjects that can be expected to exceed the criterion for a given strength level.
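
The table lookup in this endnote can be reproduced with the standard normal cumulative
distribution function. The Python sketch below assumes prediction errors are normally distributed
around the estimated score; the predicted score of 110 and the standard error of estimate of 10 are
hypothetical values chosen for illustration, while the criterion of 100 comes from the example.

```python
import math

def proportion_exceeding(predicted, criterion, see):
    """Proportion expected to exceed the criterion for a given predicted score.

    z = (predicted - criterion) / standard error of estimate; the standard
    normal CDF evaluated at z is the area at or above the criterion.
    """
    z = (predicted - criterion) / see
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Hypothetical predicted score and standard error; criterion from the example
p = proportion_exceeding(predicted=110.0, criterion=100.0, see=10.0)
print(f"{p:.4f}")  # z = 1.0, about 0.8413
```

When the predicted score equals the criterion, z is 0 and the expected proportion is one half.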

2. Personal communication between A. Jackson, University of Houston, and Dr. John Hater of the Fedex
Corporation. Engineers used the power output data in Figure 5.3 to estimate expected changes in
productivity produced by changes in the physiological capacity of the workforce.

3. This review was initially published in 1994 by one of the authors of this chapter (Jackson, 1994) and
expanded to include studies published since that time.


Abstract


In the employment setting, test scores may be used to determine and predict acceptable job
performance. This chapter focuses on the methodologies used to establish passing scores for tests that
identify individuals who are able to perform, or be trained to perform, the essential job tasks. The
methodologies discussed focus on the data generated when content and criterion-related validity
strategies are used to identify legally defensible passing scores.

The design of effective criterion measures that assess and differentiate levels of job performance
is discussed. These criterion data, along with test scores and validity coefficients, are used to
formulate passing scores that identify successful and unsuccessful candidates. Methods are explained
that assess whether a passing score maximizes correct testing decisions while also minimizing
testing errors. These methods include expectancy tables, contingency tables, and Taylor-Russell tables.
The use of ergonomic and normative data for setting standards is also discussed.

Issues  related  to  test  fairness  and  adverse  impact,  and  their  integration  with  legal  requirements, 
are  outlined.  The  effect  of  basic  physiological  tests  (e.g.,  aerobic  capacity,  strength  tests)  and  job 
simulations  on  the  reduction  of  adverse  impact  is  also  shown  by  using  comparisons  from  a  variety 
of  physically  demanding  jobs.  Finally,  the  computation  of  test  fairness  is  described,  along  with  its 
relationship to adverse impact and test utility.

Assessment procedures may be used for a variety of purposes such as selection, classification,
and placement. When assessment procedures are used in a selection setting, an applicant is
typically accepted or rejected based on test scores. Single or multiple test scores can be used to
determine whether an applicant has the ability to perform the job tasks. For example, an individual
applying for a police officer position may take a physical performance test that requires attainment
of a specific passing score. All individuals who achieve that score are considered acceptable and
eligible for hire, while those who do not are eliminated from the selection process.

In contrast, test scores used for classification and placement allow for assignment of individuals
to different groupings or categories (e.g., levels of aerobic capacity). A woman’s aerobic capacity, for
example, can be used to categorize her fitness level using normative data. A 39-year-old woman who
achieves a VO2max score of 35 ml·kg⁻¹·min⁻¹ would be classified in the 60th percentile, or “above
average,” while a 20-year-old woman with the same score would be classified between the 40th and
50th percentiles, or “below average” fitness level (American College of Sports Medicine, 2000;
Golding, Myers, & Sinning, 1989). These types of classifications can be used to assign individuals
to work or training groups, but cannot be used for selection purposes (Civil Rights Act of 1991).

This chapter focuses on how to identify specific test scores for use in making employment
decisions. This process includes identifying minimum job requirements for new hires, for promotion to
jobs with unique physical demands, and for retention of incumbent personnel. The discussion
focuses on establishing specific test scores that are predictive of acceptable job performance. The
specific minimum scores that identify individuals who pass a test are called by a variety of names:
cutoff score, cut-score, passing score, or performance standard. The term passing score is used in
this chapter to refer to a test score that is indicative of acceptable performance.


Requirements  for  Identifying  Passing  Scores 


The purpose of using passing scores or standards is to identify individuals who are able to
perform, or be trained to perform, essential job tasks. Failure to meet the established score can result
in an individual not being hired or promoted, or having to retake the test. The Equal Employment
Opportunity Commission (EEOC) Uniform Guidelines on Employee Selection Procedures
(1978) indicate that passing scores “...should normally be set so as to be reasonable and consistent
with normal expectation of acceptable proficiency within the work force” (Section 5H). Although
the EEOC Uniform Guidelines (1978) do not specifically define the terms “reasonable and
consistent” (Section 5H), they do allow organizations to establish and interpret passing scores in
relation to their specific goals such as staff utilization, growth, and profit.

Other Federal statutes, such as the Civil Rights Act of 1964 (Title VII) and 1991 (CRA 1964,
1991) and the Americans with Disabilities Act of 1990, prohibit discrimination against protected
groups (e.g., minorities). The CRA of 1964 prohibits discrimination based on race, color, gender,
national origin, or religion. The result of Title VII of the CRA of 1964 is that tests must be valid
and fair for legally protected groups. The CRA of 1991 further stipulates that separate passing
scores for subgroups (e.g., race, gender) shall not be used for assessments that affect employment
standing (e.g., selection, promotion). These statutes do not state that the tests cannot have adverse
impact. Rather, tests that have adverse impact are acceptable if they are valid and job relevant.

Prior  to  determining  a  passing  score  for  a  test,  one  must  ensure  the  test  is  valid.  That  is,  does 
the  test  battery  measure  “what  it  is  purported  to  measure”  (Anastasi,  1996)?  Three  methods  or 
experimental designs can be used to establish validity: content, criterion-related, or construct validity
(American Educational Research Association, American Psychological Association, and
National  Council  on  Measurement  in  Education,  1999). 

Content  validity  requires  that  the  test  or  test  components  sample  the  content  of  the  job.  The 
test contains simulations of essential job tasks identified in the job analysis. For example, a dockworker
test may involve stacking and unstacking cargo in a specified time frame.

A  criterion-related  validity  strategy  empirically  assesses  the  extent  to  which  a  test  is  related  to 
a  measure  of  job  performance  called  a  criterion  measure.  Two  types  of  criterion-related  validity 
designs  can  be  used.  The  first,  concurrent  validity,  correlates  existing  workers’  (incumbents)  scores 
on  the  test  (predictor)  with  a  measure  of  job  performance  (criterion).  The  second  design,  predictive 
validity, involves administering the test to job applicants and obtaining a measure of job performance
at a future date. Due to practicality issues, the concurrent validity design is used most often.

Finally, construct validity is the extent to which the test measures the trait or ability (e.g.,
aerobic capacity) identified in the job analysis as critical to effective job performance. For example,
muscular  strength  may  be  considered  a  construct  and  can  be  measured  with  a  variety  of  strength 
tests.  For  construct  validity  to  be  present,  the  correlation  between  the  test  used  to  assess  muscular 
strength  and  other  tests  known  to  assess  the  same  ability  should  be  high.  Further,  correlations 
between  muscular  strength  tests  and  tests  of  other  abilities  (e.g.,  equilibrium)  should  be  lower. 


Methods  for  Determining  Passing  Scores  or  Standards 

A  variety  of  methods  are  used  to  identify  valid  passing  scores.  The  methodology  employed 
depends  on  the  type  of  validity  strategy  used  (e.g.,  content,  criterion-related).  The  methods  are 
based  on  decision  theory  and  establishing  the  usefulness  of  the  testing  procedure.  Although  there 
are  a  variety  of  techniques,  only  the  following  will  be  addressed  in  this  chapter. 

1.  Expectancy  tables 

2.  Contingency  tables 

3. Taylor-Russell tables

4.  Normative  data 

5.  Ergonomic  data 

To use Taylor-Russell, expectancy, or contingency tables, two or three of the following parameters
must be available:

1. test scores,

2.  measures  of  job  performance  or  criterion  measure  data,  and 

3.  a  validity  coefficient,  which  is  the  correlation  coefficient  between  the  test  score(s) 
and  the  criterion  measure. 
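
As a minimal sketch of the third parameter, a validity coefficient can be computed as the Pearson correlation between test scores and criterion scores. The function name `validity_coefficient` and the sample scores below are hypothetical and are not taken from any study cited in this chapter.

```python
from math import sqrt


def validity_coefficient(test_scores, criterion_scores):
    """Pearson correlation between test scores and a criterion measure."""
    n = len(test_scores)
    if n != len(criterion_scores) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_x = sum(test_scores) / n
    mean_y = sum(criterion_scores) / n
    # Sum of cross-products and sums of squares around the means.
    cov = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(test_scores, criterion_scores))
    var_x = sum((x - mean_x) ** 2 for x in test_scores)
    var_y = sum((y - mean_y) ** 2 for y in criterion_scores)
    return cov / sqrt(var_x * var_y)


# Hypothetical incumbent data: test battery scores and supervisor rating totals.
tests = [420, 455, 480, 510, 560, 600]
ratings = [18, 22, 21, 25, 27, 30]
r = validity_coefficient(tests, ratings)
print(f"validity coefficient r = {r:.2f}")
```

In a concurrent design the two lists would hold incumbents' test and criterion scores. In a predictive design the criterion scores would be collected at a later date.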

To illustrate the use of these three types of tables, a criterion-related validity approach is used
to  obtain  the  needed  information.  Use  of  ergonomic  and  normative  data  is  addressed  separately. 

Formulation  of  Criterion  Measures 

Prior to determining the usefulness of a test and identifying a passing score when using Taylor-Russell,
expectancy, or contingency tables, a criterion measure or job performance measure must be
developed. The criterion measure is as important as the test in a criterion-related validity study. Criterion
measures should represent the important components of the job (e.g., shovel gravel for 30 minutes) and
be measurable. They must be reliable and discriminate among different levels of performance across
individuals. Finally, the criterion measure should define a level of acceptable job performance.

There  are  a  variety  of  reliable  criterion  measure  formats.  Two  commonly  used  criterion  measures 
are ratings of job performance and work samples that simulate actual job tasks (Gebhardt, Baker, &
Sheppard, 1998b; Gebhardt, Baker, Sheppard, & de Miranda, 1994; Landy & Farr, 1980, 1983).

Ratings of Job Performance: Two frequently used rating formats are graphic scales and behaviorally
anchored rating scales. These scales are constructed to discriminate among different levels
of performance. Graphic rating scales consist of a set of scale points (e.g., 1-10) with corresponding
generic, qualitative descriptors on which the rater marks the performance level for each of several
job behaviors (e.g., ability to lift 50 pounds repetitively). Figure 6.1 shows examples of two
graphic rating scales.


Scale 1

5 = Above average

4 = Slightly above average

3 = Average

2 = Below average

1 = Poor

Scale 2

5 = Greatly exceeds job requirements

4 = Exceeds job requirements

3 = Meets job requirements

2 = Meets minimal job requirements, with assistance

1 = Does not meet job requirements

Journal of Applied Psychology, 1939, Vol. 23, pp. 565-578.


Figure 6.1 Graphic rating scale examples

Behaviorally anchored rating scales (BARS) are typically developed using critical incidents in
the job that identify behaviors indicative of good, average, and poor performance
(Flanagan, 1954; Smith & Kendall, 1963). Figure 6.2 provides an example of a BARS that lists a
task and levels of performance for paramedics carrying a patient in a stair chair (Gebhardt, Baker,
Sheppard, & Leonard, 1999). Levels 1, 2, and 3 were identified by supervisors as unacceptable levels
of performance for this task. Therefore, raters (supervisors and/or peers) who assigned a rating of
“3” or lower considered the person unacceptable.


Task: Carry patient down stairs in stair chair.

6  =  Able  to  descend  3  floors  of  stairs  with  250-lb.  patient  in  stair  chair. 

5  =  Able  to  descend  3  floors  of  stairs  with  220-lb.  patient  in  stair  chair. 

4  =  Able  to  descend  3  floors  of  stairs  with  200-lb.  patient  in  stair  chair. 

3  =  Able  to  descend  3  floors  of  stairs  with  180-lb.  patient  in  stair  chair. 

2  =  Able  to  descend  3  floors  of  stairs  with  160-lb.  patient  in  stair  chair. 

1  =  Able  to  descend  3  floors  of  stairs  with  150-lb.  patient  in  stair  chair. 


Figure  6.2  BARS  scale  example 

A derivative of the BARS, called behavioral observation scales (BOS), measures the frequency
of desired behaviors (e.g., able to lift patient-loaded gurney into ambulance) (Latham &
Wexley, 1977). Similar to BARS, critical incidents and job behaviors are obtained and categorized.
Supervisors and/or peers use a frequency-based scale to rate each behavior. Raters read the behavior
and determine what percentage of the time the individual applies the behavior successfully.
Figure 6.3 presents an example of a BOS and a list of behaviors that were rated. Research has
shown that use of more complicated rating scales such as BARS is occasionally better than uncomplicated
scales such as BOS or graphic scales (Giffin, 1989). However, both BARS and graphic scales
have been used with equal success in the physical domain to validate basic ability and job simulation
tests (Gebhardt et al., 1998a, 1998b, 1999).


BOS  Scale 

5  =  Engages  in  behavior  95-100%  of  the  time 

4  =  Engages  in  behavior  85-94%  of  the  time 

3  =  Engages  in  behavior  75-84%  of  the  time 

2  =  Engages  in  behavior  65-74%  of  the  time 

1  =  Engages  in  behavior  less  than  65%  of  the  time 


Example  of  Behaviors  Rated  with  BOS 

1. Rating 4: Able to lift patient from stretcher to ambulance.

2. Rating 4: Carries equipment to accident scene (quick response bag, oxygen tank, drug box).

3. Rating 2: Able to descend 2 floors of stairs while carrying patient in a stair chair.

Figure  6.3  BOS  example 
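
The BOS bands in Figure 6.3 can be expressed as a simple lookup. This is only a sketch: the function name `bos_rating` is hypothetical, and treating fractional percentages (e.g., 94.5) as belonging to the lower band is an assumption, since the figure lists whole-number ranges only.

```python
def bos_rating(percent_of_time):
    """Map how often a behavior is shown (0-100%) to the 5-point BOS of Figure 6.3."""
    if not 0 <= percent_of_time <= 100:
        raise ValueError("percent_of_time must be between 0 and 100")
    # Lower bound of each band, checked from the highest rating downward.
    bands = [(95, 5), (85, 4), (75, 3), (65, 2)]
    for lower_bound, rating in bands:
        if percent_of_time >= lower_bound:
            return rating
    return 1  # less than 65% of the time


# Hypothetical observations for three behaviors of one paramedic.
observed = {"lift patient": 90, "carry equipment": 86, "descend stairs": 70}
ratings = {behavior: bos_rating(pct) for behavior, pct in observed.items()}
print(ratings)
```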
Ratings of performance by multiple supervisors, peers, or other job-knowledgeable individuals
using these scales have been shown to be reliable measures (r = 0.50 to 0.75) of job performance
(Gebhardt et al., 1994, 1998b; Mumford, 1983). Although supervisor ratings of workers’ behaviors
are frequently used in validation studies, peer ratings have been shown to be equally effective predictors
of job success (Cederbloom, 1989; Gebhardt et al., 1998a; Gebhardt, Schemmer, & Crump,
1985). These ratings are effective because both types of raters have observed job performance and
are familiar with what constitutes effective job performance of critical job behaviors.

When  using  ratings  of  performance,  the  researcher  must  be  cognizant  of  the  effects  of  rating  errors. 
There are three main types of errors: halo, leniency, and central tendency (King, Hunter, & Schmidt, 1980).

The halo effect occurs when the rater has an overall favorable or unfavorable impression of the worker
and rates all job tasks or behaviors in a manner consistent with this impression. Leniency is generally
defined as assigning all favorable ratings due to an unwillingness to assign different ratings for each task
or behavior. Similarly, the error of central tendency occurs when the rater uses only the middle of the
scale and does not assign ratings that are at the scale extremes (e.g., poor, excellent). To use job performance
ratings successfully, these errors must be eliminated or reduced by defining specific job traits
or behaviors, forcing discrimination in ranking workers, and training all raters (Anastasi, 1996).

Work Sample Simulations: Work sample criterion measures are composed of essential job tasks
that simulate the work environment (e.g., equipment). These simulations must be constructed to
allow for quantitative analysis of the simulated tasks. Further, the reliability of the simulation should
be determined. An example of a measurable work simulation in the tire manufacturing industry consisted
of moving tires from a simulated inspection station to a storage facility in a specific time frame
(Crump, Gebhardt, Guerette, & Wertheimer, 1985). The type and number of tires inspected and
moved in the simulation were based on the factory production schedule. The acceptable level of performance
on the simulation was determined from production data that indicated the number of tires
moved in a specific time frame. Test-retest reliabilities for this criterion measure and similar ones in
other studies ranged from .79 to .98 (Crump et al., 1985; Gebhardt et al., 1999).

Regardless of the type of criterion measures employed (e.g., ratings, work samples), it is imperative
that they be reliable and measure a representative sample of the job behaviors. Further, an
acceptable level of performance on the criterion measure must be determined in order to use contingency
and Taylor-Russell tables.

Identification  of  Acceptable  Job  Performance 

To establish a passing score, a point on the criterion measure that differentiates between acceptable
and unacceptable job performance must be identified. One method of determining acceptable
job performance is to define an acceptable level of job performance in the actual criterion measure.
Using this method, a supervisor rating scale may define levels of performance as:

1. below job requirements,

2. meets job requirements, or

3. exceeds job requirements.

If a supervisor uses a three-point scale to evaluate 10 work behaviors, where acceptable performance
is defined as level “2,” a worker would need a total summed score of 20 to be considered
as having met job requirements (i.e., 10 tasks x 2 [meets job requirements] = 20). Note that the
supervisor could rate the worker as a “2” on some tasks and a “1” or “3” on others, and the worker
could still have a total of 20, indicating that the worker’s performance was acceptable.
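
The summed-score rule can be sketched in a few lines. The function name `meets_job_requirements` and the example rating lists are hypothetical. The rule itself (number of tasks times the acceptable level) follows the 10 x 2 = 20 example above.

```python
def meets_job_requirements(task_ratings, acceptable_level=2):
    """Return True when the summed rating reaches tasks x acceptable level."""
    required_total = acceptable_level * len(task_ratings)
    return sum(task_ratings) >= required_total


# Ten tasks rated on the three-point scale (1 = below, 2 = meets, 3 = exceeds).
all_twos = [2] * 10                        # total of 20
mixed = [1, 3, 2, 2, 3, 1, 2, 2, 1, 3]     # also totals 20
mostly_low = [1, 1, 2, 2, 1, 2, 1, 2, 2, 1]  # total of 15

for label, ratings in [("all twos", all_twos), ("mixed", mixed), ("mostly low", mostly_low)]:
    print(label, sum(ratings), meets_job_requirements(ratings))
```

Because the rule is compensatory, a low rating on one task can be offset by a high rating on another, as the mixed example shows.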

Supervisor and peer ratings can also be used to identify acceptable and unacceptable performance
of a work sample criterion. For example, a work sample criterion measure was developed that
simulated a series of critical firefighter tasks (Sothmann, Gebhardt, Baker, Costello, & Sheppard,
1995). To determine the acceptable and unacceptable performance levels on the simulation, six
videotapes were generated to show six different paces of movement. The paces were established
based on statistical criteria (e.g., mean, one standard deviation below mean) from a sample of
incumbent firefighters. To determine the minimally acceptable pace for completing the simulation,
a rating instrument was designed to allow the raters to view each pace and determine whether the
pace was acceptable or unacceptable at an actual fire. A random counterbalanced design was used
for presentation of the paces to the raters. The results indicated that raters were able to correctly
rank the paces from fast to slow, and they demonstrated a high level of agreement on the paces that
were acceptable and unacceptable. Further, the minimally acceptable pace was identified and used
as the measure of acceptable job performance. This pace was then used to justify the passing score
for a firefighter selection test.

Another method of defining acceptable job performance is through the use of ergonomic and
production data. Gebhardt, Schemmer, and Crump (1985) completed a study in the longshore
industry in which one of the essential tasks involved securing 40-foot containers to the deck of a ship
with long metal rods and turnbuckles. This task required affixing the rods to the container and securing
them by attaching and turning a turnbuckle until it locked the rods in place. This task was simulated
as a criterion measure in a validation study. The level of acceptable job performance for the
tightening and loosening of the turnbuckle was obtained by measuring the force required to “break”
a tightened turnbuckle and the number of turns on a turnbuckle needed to secure a rod to a container.
Individuals able to exert the required force and turn the turnbuckle the required number of
times in the specified time period exhibited acceptable job performance. The time period and number
of turnbuckles attached were established from ergonomic data. These data were obtained for a
variety of sizes of container ships and included the time required to lash (attach rods and turnbuckles)
all containers to the ship deck, the number of longshore workers assigned to a lashing operation,
the number of containers placed in a stack, and the number of container stacks lashed per row.

Decision  Models  for  Investigating  Test  Effectiveness  and  Setting 
Passing  Scores 

The purpose of using tests and other assessment procedures is to predict future job performance
by identifying individuals who can perform essential job tasks in a safe and effective manner.
Individuals who pass a test should have a high probability of job success in an organization. If the
passing score is set too high, the test will have adverse impact on protected groups and reject individuals
who are qualified for the job. If the passing score is set too low, very few candidates will be
screened out, allowing unqualified individuals to be hired and diminishing the test effectiveness.

Passing scores need to meet multiple goals. First, a passing score should be at a level that reflects
acceptable job performance. Second, a passing score should maximize correct testing decisions and
minimize testing errors. Third, the adverse impact of the test and its passing score must be considered.
For example, a passing score set at a low level may have little or no adverse impact. However, a passing
score set at a high level may result in adverse impact. Fourth, the passing score should be set at a level
that provides test utility. The primary purpose of a selection test is to identify applicants who will be
successful on the job and screen out those who will not. What makes the task of setting a passing score
more difficult is that these goals are in conflict with one another. Thus, the challenge of setting an accurate
passing score is the integration of these goals with the needs of the hiring organization.

For  tests  with  demonstrated  validity  and  reliability,  several  methods  can  be  used  to  investigate 
the  test’s  usefulness  and,  in  turn,  set  a  passing  score  at  a  point  that  identifies  future  successful 
employees.  Three  of  these  methodologies  are  described  below:  (1)  Taylor-Russell  tables,  (2) 
expectancy  tables,  and  (3)  contingency  or  sensitivity  tables. 

Taylor-Russell Tables: The Taylor-Russell tables allow a researcher to estimate the percentage of
new employees who will be successful on the job if the test is used (Taylor & Russell, 1939). To use
these tables one must know the following:

1. validity coefficient for the test (e.g., R = .70),

2.  selection  ratio,  and 

3.  base  rate. 

The validity coefficient (correlation between test and criterion measure) can be obtained through a
criterion-related validity study or by estimating the test validity through past research for similar
methods and studies (e.g., validity generalization). The selection ratio is the percentage of applicants
an organization must hire to fill vacant jobs in relation to the total number of applicants (i.e., number
of openings/number of applicants for the position).

Calculation of the base rate involves two steps. First, the number of current employees who were
hired prior to the use of the test is determined. Second, the percentage of these current employees
who demonstrate successful job performance is computed. One method to obtain this number is to
set a criterion for successful job performance and determine which employees are above it and
below it. For example, if a tire manufacturer requires that a tire builder produce 50 tires per shift,
those employees who produce 50 or more are considered successful and those who produce fewer
are unsuccessful. Therefore, if there are 100 tire builders and 60 produce 50 or more tires per shift,
the base rate is .60 (i.e., 60/100 = .60).
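
The selection ratio and base rate are simple proportions, sketched below. The function names, the per-builder output figures, and the counts of openings and applicants are hypothetical. Only the 50-tire criterion and the resulting .60 base rate come from the example above.

```python
def selection_ratio(openings, applicants):
    """Number of openings divided by the number of applicants for the position."""
    if applicants <= 0:
        raise ValueError("applicants must be positive")
    return openings / applicants


def base_rate(outputs_per_shift, criterion=50):
    """Proportion of current (untested) employees who meet the performance criterion."""
    if not outputs_per_shift:
        raise ValueError("need at least one employee")
    successful = sum(1 for output in outputs_per_shift if output >= criterion)
    return successful / len(outputs_per_shift)


# Hypothetical output for 100 tire builders: 60 build 55 tires, 40 build 45.
builders = [55] * 60 + [45] * 40
print(base_rate(builders))        # proportion producing 50 or more tires
print(selection_ratio(50, 100))   # hypothetical: 50 openings, 100 applicants
```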

After the validity, selection ratio, and base rate are determined, the Taylor-Russell tables are
examined. For illustrative purposes, the Taylor-Russell .60 base rate table is presented in Table 6.1.
This .60 table is used because 60% of existing workers who did not take the test are performing
successfully on the job. Using the tire builder example, we assume the physical performance test had
a test validity coefficient of .70 and the selection ratio was .50. Based on these data, Table 6.1 shows
that 84% (.84) of the applicants who pass the test and are hired would be successful on the job.

Table 6.1 Example of Taylor-Russell table for base rate .60

                                     Selection Ratio
Validity    .05    .10    .20    .30    .40    .50    .60    .70    .80    .90    .95

  .00       .60    .60    .60    .60    .60    .60    .60    .60    .60    .60    .60
  .10       --     .67    .65    .64    .64    .63    .63    .62    .61    .61    .60
  .20       --     --     .71    .69    .67    .66    .65    .64    .63    .62    .61
  .30       --     .79    .76    --     --     .69    --     --     .64    .62    .61
  .40       --     --     --     --     --     --     --     --     .66    .63    .62
  .50       --     --     --     .82    --     --     .73    .70    .67    .64    --
  .60       .96    .94    --     .87    .83    .80    .76    .73    .69    .65    .63
  .70       .99    --     --     .91    .87    .84    --     --     .71    .66    .63
  .80      1.00    --     .98    .95    .92    .88    .83    .78    .72    .66    .63
  .90      1.00   1.00   1.00    .99    --     --     .88    .82    .74    .67    .63
 1.00      1.00   1.00   1.00   1.00   1.00   1.00   1.00    .86    .75    .67    .63
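
A Taylor-Russell value can be approximated from the bivariate normal model that underlies the tables. The sketch below is an illustration under stated assumptions, not the procedure used to build Table 6.1: the function name `taylor_russell`, the midpoint-rule integration, and the upper integration limit of 8 standard deviations are choices made here.

```python
from statistics import NormalDist


def taylor_russell(validity, selection_ratio, base_rate, steps=4000):
    """Approximate the proportion of selected applicants who will be successful.

    Assumes test and criterion scores are bivariate normal with correlation
    `validity`; applicants above the test cutoff are selected.
    """
    if not -1.0 < validity < 1.0:
        raise ValueError("validity must be strictly between -1 and 1")
    if not (0.0 < selection_ratio < 1.0 and 0.0 < base_rate < 1.0):
        raise ValueError("selection_ratio and base_rate must be between 0 and 1")
    std = NormalDist()
    x_cut = std.inv_cdf(1.0 - selection_ratio)  # test cutoff in z-score units
    y_cut = std.inv_cdf(1.0 - base_rate)        # criterion cutoff in z-score units
    resid_sd = (1.0 - validity ** 2) ** 0.5     # SD of criterion given test score
    upper = 8.0                                 # density beyond this is negligible
    width = (upper - x_cut) / steps
    joint = 0.0
    for i in range(steps):
        x = x_cut + (i + 0.5) * width           # midpoint of the slice
        p_success = 1.0 - std.cdf((y_cut - validity * x) / resid_sd)
        joint += std.pdf(x) * p_success * width
    # Divide P(selected and successful) by P(selected).
    return joint / selection_ratio


# Tire builder example: validity .70, selection ratio .50, base rate .60.
print(round(taylor_russell(0.70, 0.50, 0.60), 2))
```

With zero validity the function returns the base rate, which matches the first row of Table 6.1.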

Expectancy Tables: An expectancy table shows the level of job performance (i.e., criterion measure)
expected  based  on  a  particular  physical  performance  test  score.  This  table  indicates  the  probability  of 
different  job  performance  outcomes  for  individuals  obtaining  specific  test  scores.  Expectancy  table 
values  are  determined  from  data  generated  in  a  criterion-related  validity  study  in  which  workers  are 
evaluated  on  the  test  and  job  performance.  This  approach  allows  one  to  predict  a  future  worker's  level 
of  job  performance  when  various  minimum  passing  scores  are  applied  to  an  applicant  population. 

Expectancy tables are generated by ranking the workers on a single test score or a combination (e.g.,
sum) of multiple test scores. The distribution of test scores is divided into 10 equal intervals, with each
interval containing approximately the same number of workers. The mean criterion score (i.e., job performance)
at each 10% interval of the test score distribution is calculated to produce an expectancy table.
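
The ranking-and-interval procedure can be sketched as follows. This is a simplified reading, with assumptions: the function name `expectancy_table` is hypothetical, each row reports workers at or above the cut (as the table in this chapter does), and the cut score is taken as the lowest test score retained at each level rather than an interval mean. The sample data are invented for illustration.

```python
def expectancy_table(test_scores, job_scores, levels=range(0, 100, 10)):
    """Rows of (percent level, cut score, % at or above cut, mean job score at or above cut)."""
    if len(test_scores) != len(job_scores) or not test_scores:
        raise ValueError("need two equal-length, non-empty lists")
    # Rank workers from lowest to highest test score.
    ranked = sorted(zip(test_scores, job_scores))
    n = len(ranked)
    rows = []
    for level in levels:
        start = level * n // 100          # index of first worker at or above this level
        retained = ranked[start:]
        cut_score = retained[0][0]        # lowest test score still passing
        pct_retained = 100.0 * len(retained) / n
        mean_job = sum(job for _, job in retained) / len(retained)
        rows.append((level, cut_score, pct_retained, mean_job))
    return rows


# Hypothetical incumbents: ten test scores and ten job performance scores.
tests = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
jobs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
for row in expectancy_table(tests, jobs):
    print(row)
```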

A sample expectancy table is presented in Table 6.2. This table lists (1) the mean test score at
each 10% interval, (2) the percent of incumbents who attained the mean test score or higher at each
interval, and (3) the expected job performance (criterion measure) for the incumbents who met or
exceeded the test score. The mean test score for the first 10% interval is 431.7. At this level 90.8%
of the incumbent workers achieved a test score of 431.7 or greater. This means that 9.2% (i.e.,
100% - 90.8% = 9.2%) of the incumbents would fail the test if 431.7 were the passing score. Similarly,
if a minimum passing score of 455.1 were selected, 80.5% of the incumbents would pass the test and
19.5% would fail.

Table 6.2 Combined physical performance test battery¹ expectancy table

                            % Incumbents Scoring
                            Higher than Test Mean¹
% Level     Test Score      (n = 87)                  Job Performance

  00           --              100.0                      32.0
  10          431.7             90.8                      38.1
  20          455.1             80.5                      45.3
  30          483.8             70.1                      53.1
  40          518.7             59.8                      58.2
  50          550.0             50.6                      63.2
  60          569.5             41.4                      67.3
  70          595.7             31.0                      71.6
  80          605.9             21.8                      75.8
  90          647.3             10.3                      78.9

1. Unequal percent of incumbents per interval due to unequal n.

Table  6.2  also  shows  the  impact  of  test  performance  on  job  performance.  For  example,  the  job 
performance  measure  used  in  this  table  is  supervisor/peer  ratings  of  performance  for  10  essential 
tasks  in  which  a  total  of  30  points  defines  meeting  job  requirements.  If  an  applicant  attained  a  test 
score  of  431.7,  the  expected  job  performance  would  be  38.1  or  8.1  points  above  the  minimum 
acceptable  level.  If  a  score  of  455.1  were  achieved,  the  average  job  performance  would  be  45.3 
points  or  15.3  points  above  the  minimum. 

Of interest is the increase in job performance at different levels. In Table 6.2, the mean job performance
for the entire sample of workers was 32 points, which is an acceptable job performance
score. However, there were workers in this sample who were unacceptable. Therefore, the increase
in job performance from the mean (32) to a test score of 431.7 (10% level) was 6.1 points
(38.1 - 32.0). Similarly, the increase in job performance between a test score of 431.7 (10% level)
and 455.1 (20% level) was 7.2 points. These data indicate that the 20% level score provides a greater
increase in predicted job performance, and therefore, a more successful worker than the 10% level.
However, it is necessary to investigate whether the 20% level results in more or less accurate decisions
related to individuals who pass the test and have acceptable job performance and those who
fail the test and are unacceptable on the job. Determining the accuracy of a potential passing score
is accomplished through the use of contingency tables.
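
The gains quoted above are successive differences in expected job performance. A short sketch using the values given in this chapter for the sample mean and the 10%, 20%, and 30% levels (variable names are illustrative only):

```python
# Expected job performance from Table 6.2: whole sample, then 10%, 20%, 30% levels.
levels = ["sample mean", "10% (431.7)", "20% (455.1)", "30% (483.8)"]
job_performance = [32.0, 38.1, 45.3, 53.1]

# Gain at each level relative to the level just below it, rounded to one decimal.
gains = [round(higher - lower, 1)
         for lower, higher in zip(job_performance, job_performance[1:])]

for label, gain in zip(levels[1:], gains):
    print(f"{label}: +{gain} points")
```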

Contingency Tables: Contingency tables are used to determine the accuracy of a proposed passing
score in classifying workers as acceptable and not acceptable based on their test scores. If a testing
procedure were perfect, all individuals who passed the test would be successful on the job and
all those who did not would be unsuccessful. However, testing procedures are rarely perfect, and
errors occur in the classification of individuals. One of these errors is classifying successful workers
as unsuccessful based on their test scores. This is called a false negative because the workers who
scored below a designated passing score would perform acceptably on the job. The second error,
false positive, occurs when an individual's test score indicates or predicts successful job performance,
but in reality job performance is unacceptable (Safrit & Wood, 1989). These classifications
can be used to calculate selection ratios which indicate how accurately different test scores identify
individuals whose job performance is acceptable and unacceptable.

Figure  6.4  illustrates  a  distribution  of  test  scores  in  relation  to  job  success.  The  four  quadrants 
contain  two  types  of  acceptances  and  rejections — 

1. false rejections, workers who failed the test and were deemed to have acceptable job performance,

2. true acceptances, workers who scored well on the test and were acceptable on the job,

3.  true  rejections,  workers  who  scored  poorly  on  the  test  and  had  unacceptable  job  performance,  and 

4.  false  acceptances,  workers  who  scored  well  on  the  test  and  had  unacceptable  job  performance. 


                                                  Test Scores

                                       Fail                        Pass

Acceptable job performance             False Rejection - 2         True Acceptance - 18

Unacceptable job performance           True Rejection - 8          False Acceptance - 2

Figure 6.4 Example of contingency table

The selection ratio, or percentage of time that one expects to be accurate, is calculated by summing
the number of true acceptances and true rejections and dividing by the total number of examinees.
That is:

\[
\text{Correct Decisions} =
\frac{\text{True Acceptances} + \text{True Rejections}}
     {\text{True Acceptances} + \text{True Rejections} + \text{False Acceptances} + \text{False Rejections}}
\]


Using the data presented in Figure 6.4, the percentage of correct decisions would be 86.7 percent:

\[
\text{Correct Decisions} = \frac{18 + 8}{18 + 8 + 2 + 2} = \frac{26}{30} = 0.867
\]

With this approach, the accuracy of several potential test passing scores as predictors of performance
for the data in Figure 6.4 can be investigated.
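
The same calculation can be sketched in code, including the step of sorting workers into the four quadrants for a candidate passing score. The function names `tally_decisions` and `decision_accuracy` and the small worker list are hypothetical. The counts 18, 8, 2, and 2 are those shown in Figure 6.4.

```python
def tally_decisions(workers, passing_score):
    """Count the four quadrants for (test_score, job_acceptable) pairs."""
    counts = {"true_accept": 0, "false_accept": 0, "true_reject": 0, "false_reject": 0}
    for test_score, job_acceptable in workers:
        passed = test_score >= passing_score
        if passed and job_acceptable:
            counts["true_accept"] += 1
        elif passed and not job_acceptable:
            counts["false_accept"] += 1
        elif not passed and not job_acceptable:
            counts["true_reject"] += 1
        else:
            counts["false_reject"] += 1
    return counts


def decision_accuracy(true_accept, true_reject, false_accept, false_reject):
    """Proportion of correct decisions among all examinees."""
    total = true_accept + true_reject + false_accept + false_reject
    if total == 0:
        raise ValueError("no examinees")
    return (true_accept + true_reject) / total


# Counts from Figure 6.4.
accuracy = decision_accuracy(true_accept=18, true_reject=8, false_accept=2, false_reject=2)
print(round(accuracy, 3))

# Hypothetical raw data: (test score, job performance acceptable?).
sample = [(460, True), (440, True), (470, False), (430, False)]
print(tally_decisions(sample, passing_score=455.1))
```

Repeating the tally for several candidate passing scores gives the kind of comparison discussed next.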

The previous example of an expectancy table for the tire builder job showed that either a score
of 431.7 (10% level) or a score of 455.1 (20% level) provided substantial increases in job performance
(Table 6.2). Table 6.3 shows that for individuals who achieve a test score of 431.7 or higher,
90% would have acceptable job performance (true acceptances) and 10% would not (false acceptances).
For those individuals whose test score was below 431.7, 80% were deemed to have unacceptable
job performance (true rejections) and 20% had acceptable job performance (false rejections).
The accuracy of the pass/fail ratios, or true acceptances and true rejections, for the score of 455.1 is
similar to that for the score of 431.7 (i.e., 91% and 72%, respectively). However, for a score of 483.8
(30% level) the true acceptance ratio (93%) remains high, but the true rejections drop dramatically (51%).
This drop indicates that 49% of the individuals who scored below 483.8 and failed the test were deemed
to have acceptable job performance. Therefore, use of a passing score of 483.8 would not be reasonable
because it would eliminate many individuals who could perform the job. Setting the passing
score at 455.1 or 431.7 would provide a more accurate assessment of job success.


Table 6.3 Comparison of test score to acceptable and unacceptable job performance

                                          Job Performance Level
                       Of those incumbents obtaining a      Of those incumbents failing at a
                       specific test score, the expected    specific test score, the expected
                       job performance would be:            job performance would be:
Minimum Test
Passing Score          Unacceptable %    Acceptable %       Unacceptable %    Acceptable %

If no test used              21               79                  -                -
431.7 (10%)                  10               90                  80               20
455.1 (20%)                   9               91                  72               28
483.8 (30%)                   7               93                  51               49
518.7 (40%)                   4               96                  44               56
550.0 (50%)                   2               98                  34               66
569.5 (60%)                   3               97                  29               71
595.7 (70%)                   4               96                  24               76
605.9 (80%)                   6               94                  21               79
647.3 (90%)                  11               89                  18               84
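
The percentages in a contingency table of this kind can be computed directly from incumbent data. The following is a minimal sketch, assuming each incumbent has one test score and one acceptable/unacceptable job performance rating. The function name, the data, and the candidate passing score of 445 are hypothetical and are not taken from the tire builder study.

```python
def contingency_percentages(scores, acceptable, cut_score):
    """Decision-accuracy percentages for one candidate passing score.

    scores     -- test scores, one per incumbent
    acceptable -- booleans; True if that incumbent's job performance was acceptable
    cut_score  -- candidate minimum passing score (score >= cut_score passes)
    """
    passed = [ok for s, ok in zip(scores, acceptable) if s >= cut_score]
    failed = [ok for s, ok in zip(scores, acceptable) if s < cut_score]

    def pct(part, whole):
        # Percentage rounded to one decimal; None if the group is empty.
        return round(100.0 * part / whole, 1) if whole else None

    return {
        "true_acceptance": pct(sum(passed), len(passed)),
        "false_acceptance": pct(len(passed) - sum(passed), len(passed)),
        "true_rejection": pct(len(failed) - sum(failed), len(failed)),
        "false_rejection": pct(sum(failed), len(failed)),
    }


# Hypothetical incumbent data (illustration only).
scores = [400, 420, 430, 440, 450, 460, 470, 480, 500, 520]
acceptable = [False, False, True, False, True, True, True, False, True, True]

print(contingency_percentages(scores, acceptable, cut_score=445))
```

Running the function over each candidate passing score produces one row of the table per score, which makes it easy to see where the true rejection percentage begins to fall.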

Combined Use of Expectancy and Contingency Tables. To determine which score to use as the
passing score, the expectancy table (Table 6.2) can be consulted to determine which score resulted
in a greater increase in predicted job performance. Table 6.2 shows that the increase in job
performance at the 10 percent level (431.7) is 6.1 points and the increase at the 20 percent level
(455.1) is 7.2 points. The greatest increase in job performance, 7.8 points, is found between the
20 percent (455.1) and 30 percent (483.8) levels. However, the accuracy of true rejections at the
30 percent level (i.e., 51%), as shown in Table 6.3, is lower than at the 20 percent level (i.e.,
72%). Therefore, since the decision accuracy for the 10 percent (80% true rejections; 90% true
acceptances) and 20 percent (72% true rejections; 91% true acceptances) levels is comparable, the
greatest utility would be achieved if the passing score were set at the 20 percent level, a
passing score of 455.1, to obtain a greater expected job performance level.


Normative  Data 

Scores from standardized or widely administered physical tests (e.g., sit-ups) can be used to
compare one group to the original reference group. Such a comparison would determine an
individual’s status relative to his/her normative group. This type of testing and use of test
scores, which uses previously gathered data for comparison, is called norm-referenced testing
(Nitko, 1984; Safrit & Wood, 1989). When using norm-referenced testing, information about an
individual’s position relative to the normative group can be expressed in percentile ranks,
normalized standard scores, or a descriptive classification such as “good” or “average.” Although
other scoring methods are used with norm-referenced tests, these are the most commonly encountered
in physical testing.

Norm-referenced testing can be useful in determining an individual’s ability relative to others
in the same reference group. For example, a 30-year-old woman who obtains a score of 18 inches
on a flexibility test is considered “average,” with a percentile rank of 45 when compared with
other women of the same age (Golding, Myers, & Sinning, 1989). A 30-year-old man with the same
flexibility score would be considered “excellent,” with a percentile rank of 75 when compared with
men in the same age group. If the wrong reference group (e.g., 30-year-old women) were used, this
man would be misclassified. Use of the correct reference group is fundamental to classifying
individuals accurately based on a testing procedure.
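
The dependence of a percentile rank on the reference group can be illustrated with a short sketch. The two samples below are invented for illustration and were chosen only so that the output mirrors the percentile ranks quoted above; they are not the Golding, Myers, and Sinning norms. Counting ties as half is one common convention and is an assumption here.

```python
def percentile_rank(score, reference_scores):
    """Percent of the reference group scoring below `score`, counting ties as half."""
    below = sum(1 for r in reference_scores if r < score)
    equal = sum(1 for r in reference_scores if r == score)
    return 100.0 * (below + 0.5 * equal) / len(reference_scores)


# Hypothetical flexibility scores (inches) for two reference groups.
women_age_30 = [14, 15, 16, 17, 18, 19, 19, 20, 21, 22]
men_age_30 = [11, 12, 13, 14, 15, 16, 17, 18, 19, 20]

score = 18
print(percentile_rank(score, women_age_30))  # rank against the women's sample
print(percentile_rank(score, men_age_30))    # rank against the men's sample
```

The same raw score receives a different rank in each sample, which is why a classification based on the wrong reference group is misleading.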

Most normative physical performance test data are collected on the “general population” by age
and gender. Classifying individuals into above average, average, and fair groups may provide
general information about an individual’s relative standing. However, for employment purposes this
classification method is inadequate. A passing score must represent the absolute level of
performance needed for successful job performance. To merely identify a percentile level (e.g.,
40%) and indicate that anyone who scores below this level fails the test is inadequate for making
employment decisions, because it does not establish the relevance of the passing score (e.g., 40%)
to future job performance.

To use the information obtained from a norm-referenced test in an employment setting, two
criteria should be applied. First, the normative data must be drawn from a sample population
representative of the prospective job population. It would be inappropriate to use a percentile
rank (e.g., 40%) from normative general population data as the acceptable performance level to
determine whether an applicant for a police officer position passes or fails the sit-ups test. The
normative data for such a determination should be based on a normative police officer population.
Second, the score (e.g., percentile) selected as the passing score must represent the level of
performance needed for the job.

Finally,  it  should  be  noted  that  the  Civil  Rights  Act  of  1991  (Section  106)  prohibits  use  of 
adjusted  or  different  test  passing  scores  based  on  race,  gender,  color,  religion,  or  national  origin.  Use 
of  different  standards  to  compute  norm-referenced  scores  for  individuals  in  different  groups  (e.g., 
percentiles,  T-scores)  is  unlawful  in  a  selection  setting.  However,  recent  litigation  (Lanning  v. 
SEPTA,  1999)  has  challenged  the  use  of  a  single  passing  score  in  favor  of  separate  passing  scores 
based on gender for tests that are assessing physical fitness.


Ergonomic Data


Ergonomic or workplace data such as the energy cost of the work being performed, weights and
dimensions of objects being lifted, distances objects are carried, and forces required to operate
equipment have been used to establish passing scores for physical tests. Because passing scores
should be set at a level that ensures minimally acceptable performance (Cascio, Outtz, Zedeck, &
Goldstein, 1991; Cascio, Alexander, & Barrett, 1988), ergonomic data help define the minimal job
demands. Ergonomists and physiologists have defined the energy costs (e.g., oxygen consumption,
kilocalories per minute) and strength demands of work for job tasks ranging from light industrial
to heavy tasks involved in firefighting, coal mining, and other manual materials handling jobs
(Astrand & Rodahl, 1986; Ayoub, 1991). These ergonomic parameters have been used to match
the worker to the job demands, as well as to modify the workplace design.

Research using ergonomic data to identify the energy costs of fighting fires and the forces
required to lift patient-loaded gurneys into an ambulance has shown that job-related passing scores
can be identified for aerobic capacity and strength tests. A study conducted using experienced
firefighters indicated that a VO₂max of 33.5 ml·kg⁻¹·min⁻¹ was needed to meet the demands of a
sequence of essential firefighting tasks (Sothmann, Saupe, Jasenof, Blaney, Fuhrman, Woulfe, Raven,
Pawelczyk, Dotson, Landy, Smith, & Davis, 1990). In this study, contingency tables were generated
based on this value and were used to classify firefighters as acceptable and unacceptable. These
contingency tables indicated that the true acceptances were 83%, and the true rejections were 67%.
Lowering the acceptable VO₂max to 30.5 ml·kg⁻¹·min⁻¹ resulted in only 25% of the firefighters
being able to achieve acceptable job performance. Recent research using a criterion-related validity
study compared incumbent firefighters’ job performance with several strength and aerobic capacity
measures and found that a VO₂submax of 33.5 ml·kg⁻¹·min⁻¹ was indicative of acceptable job
performance (Gebhardt, Baker, & Sheppard, 1995; Sothmann et al., 1995). Thus, a firefighter physical
test that requires a VO₂max of 33.5 ml·kg⁻¹·min⁻¹ would be appropriate for selecting firefighters.

In another study, the minimum force required to lift the head-end of an ambulance gurney was
computed using a biomechanical model. Ergonomic data such as the gurney weight, length, width,
and height at full extension were obtained, along with the average weight and gender of the patients
(Gebhardt, 1990). A model using these variables was designed to determine the force required to
lift the head- and foot-ends of the gurney. The model indicated that the minimally acceptable force
to lift the head-end of the patient-loaded gurney from the ground to full extension was 154.5
pounds. A criterion-related validity study conducted with incumbent paramedics used an instrument
that simulated the gurney lifting position, along with other muscular strength and endurance
tests. This study found that a force of 157.7 pounds on the simulated gurney lift was the minimum
force associated with acceptable job performance. This force closely approximated the force value
obtained in the biomechanical model.

Clearly, ergonomic data can be used effectively to define levels of performance. In addition to
the examples above, other measures, such as production rates, can be used to identify a minimum
level of performance. However, Jackson cautioned researchers that use of published energy cost data
is problematic, in that it represents average values for the general population (Jackson, 1994). As
with norm-referenced testing, energy costs across workers (e.g., by gender) may be equivalent, but
an individual’s actual work or power output may differ.


Test Fairness and Adverse Impact


Adverse impact and test fairness go hand in hand when evaluating physical performance tests
and establishing passing scores. The EEOC Uniform Guidelines (1978) define adverse impact by
the four-fifths (4/5s) or 80% rule, under which adverse impact exists when the passing rate for the
minority group (e.g., women) is less than 80% of the pass rate for the majority group (e.g., men). A
detailed discussion of the computation of adverse impact can be found in Chapter 7 of this
State-of-the-Art Report (SOAR). Adverse impact in physical testing is based primarily on gender
(Gebhardt, 2000; Hogan, 1991). A test that had no adverse impact might be considered to be a fair
test. However, Guion (1966) indicated that test fairness in an employment setting includes more
than just lack of adverse impact. His definition of test fairness is “Individuals with equal
probabilities of success have equal probabilities of being hired.”

Adverse  Impact  of  Physical  Tests 

Past research has confirmed that men and women have significant strength differences
(Gebhardt, 1999; Gebhardt et al., 1985, 1998b, 1999; Hogan, 1991; Jackson, 1994; McArdle,
Katch, & Katch, 1996; Myers, Gebhardt, Crump, & Fleishman, 1983a; NIOSH, 1981). The
greater the physical demands of a job, the less likely it is that a subgroup of women will achieve
a pass rate that is 80% of the men’s pass rate. This is demonstrated by an example provided by
Jackson (Jackson, 2000, Table 1) in which women and men are compared on an incremental lifting task
that starts at 22 pounds and continues in 11-pound increments to 99 pounds. Using the 80% or 4/5s
rule, adverse impact was shown at a lift of 55 pounds. These results are similar to a large U.S.
Army study that used incremental lifting and found significant differences between men and women
before and after basic training (Myers, Gebhardt, Crump, & Fleishman, 1983b). Since most physically
demanding jobs require muscular strength, tests that evaluate the physical abilities needed for the
job will have a muscular strength component. Therefore, it can be inferred that most physical tests
will have an adverse impact on women as defined by the EEOC Uniform Guidelines.
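
The four-fifths rule reduces to a ratio of two pass rates. The sketch below shows that arithmetic under stated assumptions: the function names and the applicant counts are hypothetical and are not taken from Jackson (2000) or the Army study.

```python
def adverse_impact_ratio(focal_passed, focal_tested, reference_passed, reference_tested):
    """Pass rate of the focal group divided by the pass rate of the reference group."""
    focal_rate = focal_passed / focal_tested
    reference_rate = reference_passed / reference_tested
    return focal_rate / reference_rate


def has_adverse_impact(ratio, threshold=0.8):
    """Four-fifths (80%) rule: adverse impact is indicated when the ratio falls below 0.8."""
    return ratio < threshold


# Hypothetical example: 30 of 50 women pass (60%) and 90 of 100 men pass (90%).
ratio = adverse_impact_ratio(30, 50, 90, 100)
print(round(ratio, 3), has_adverse_impact(ratio))
```

In this invented example the women's pass rate is about two-thirds of the men's, which is below four-fifths, so the rule would indicate adverse impact.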

The EEOC Uniform Guidelines also state that an alternative assessment with less adverse
impact should be used. Many test developers believed that use of a work sample or job simulation
test would reduce the adverse impact observed with basic ability tests (e.g., strength). However,
research has shown that both have adverse impact. Figure 6.5 shows means for men and women
across several basic ability strength tests. In all tests men perform significantly better than women,
and the women’s scores as a percentage of the men’s ranged from 59% to 67% (Gebhardt, 1999).
Similarly, Figure 6.6 illustrates that women’s performance on job simulation tests as a proportion
of men’s performance is similar to that shown for basic ability tests in Figure 6.5. For example, the
women’s means for job simulation tests such as setting and climbing ladders, applying sidewall to
tires, and hanging 14-foot rods on containers, all of which required upper body strength, were
similar or lower percentages of the men’s means than those found for the basic ability upper body
strength tests (e.g., arm lift). Therefore, use of a job simulation may not be an effective
alternative assessment procedure as defined by the EEOC Uniform Guidelines.

Figure 6.5 Basic ability tests: Women’s performance as percent of men’s. From “Applying regression
analysis and biomechanical modeling to setting of cut-scores or qualifying standards,” by D.L.
Gebhardt, 1999, paper presented at American College of Sports Medicine, Seattle, WA. Reprinted with
permission of author.

Finally, the magnitude of the difference between mean scores for men and women on physical
tests is large. The statistic used to express the size of these differences is called effect size,
with a value of 0.8 or greater being defined as a large effect size (Cohen, 1988). The effect sizes
found in a variety of studies ranged from 0.8 to 1.5 (e.g., Baker, Sheppard, Gebhardt, & Leonard,
2000; Gebhardt et al., 1998a; Marcinko, Nelson, Schneider, & Sproule, 1997). Therefore, the
statistical difference between men’s and women’s physical test means is very large, as found in many
studies (Gebhardt et al., 1998b, 1999; Reilly, Zedeck, & Tenopyr, 1979; Sothmann et al., 1995).
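
As an illustration of how such an effect size is obtained, the sketch below computes Cohen's d as the difference between two group means divided by the pooled standard deviation. The pooled-standard-deviation form is an assumption here, since the text does not state which variant the cited studies used, and the scores are invented.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Standardized mean difference using the pooled sample standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_variance = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_variance)


# Hypothetical strength test scores (pounds), for illustration only.
men = [100, 110, 120, 130, 140]
women = [80, 90, 100, 110, 120]

d = cohens_d(men, women)
print(round(d, 2), d >= 0.8)  # second value: True when the effect is "large" by Cohen's convention
```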


Test  Fairness 


To use a test to predict applicants’ future job performance, the test must be valid. Past research
has shown that physical tests do indeed have high levels of validity (e.g., Gebhardt, 1985, 1998a & b,
1999; Jackson, 1994; Myers et al., 1983; Reilly et al., 1979; Sothmann et al., 1995). Even though
an adverse impact on women was found in these studies (e.g., Gebhardt, 1999; Sothmann et
al., 1995), it does not necessarily follow that the tests were discriminatory or in violation of Federal
statutes. These statutes state that tests with adverse impact may be used if the employer can
demonstrate the job-relatedness and validity of the tests. Therefore, tests with adverse impact may
be considered fair. In many of the studies cited above, the tests had adverse impact on women but were
found to be fair across subgroups (e.g., gender, race) because the tests measured specific abilities or
behaviors needed for effective job performance. Thus, these tests are more likely to retain validity
across different subgroups such as men and women (Anastasi, 1996).

Figure 6.6 Job simulation tests: Women’s performance as percent of men’s. From “Applying regression
analysis and biomechanical modeling to setting of cut-scores or qualifying standards,” by D.L.
Gebhardt, 1999, paper presented at American College of Sports Medicine, Seattle, WA. Reprinted with
permission of author.

To assess whether a test is valid for multiple groups (e.g., gender) or biased toward one
subgroup, it is necessary to determine whether:


1. the validity coefficients or regression lines are similar across subgroups and

2. the test and/or criterion scores differ across subgroups (Bartlett, Bobko, Mosier, &
Hannan, 1978; Kerlinger & Pedhazur, 1973).

The statistical procedure used for this analysis is differential prediction, in which a moderated
multiple regression analysis is employed to examine whether the validity coefficients (slope) and test
and/or criterion mean scores (intercept) across subgroups differ from the overall regression equation
(Bartlett et al., 1978; Kerlinger & Pedhazur, 1973). In the physical test domain, test fairness analyses
are typically used to determine whether a test is biased in relation to gender, ethnic, and age groups.
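
The structure of a differential prediction analysis can be sketched as follows, assuming a single test score, a single criterion score, and a subgroup coded 0 or 1. The moderated model is criterion = b0 + b1·test + b2·group + b3·test·group, where b2 is the intercept difference and b3 is the slope difference between subgroups. With a two-level subgroup variable, the least-squares estimates of this model equal those obtained by fitting a separate line to each subgroup, which is what the sketch does. It returns coefficient estimates only; the significance tests a real analysis requires are omitted, and all names and data are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares slope and intercept for one predictor."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def differential_prediction(test, criterion, group):
    """Estimate b0..b3 of criterion = b0 + b1*test + b2*group + b3*test*group.

    `group` must be coded 0 or 1. Estimates come from separate subgroup fits,
    which match the moderated regression estimates for a two-level moderator.
    """
    x0 = [t for t, g in zip(test, group) if g == 0]
    y0 = [c for c, g in zip(criterion, group) if g == 0]
    x1 = [t for t, g in zip(test, group) if g == 1]
    y1 = [c for c, g in zip(criterion, group) if g == 1]
    slope0, intercept0 = fit_line(x0, y0)
    slope1, intercept1 = fit_line(x1, y1)
    return {
        "b0": intercept0,               # intercept for subgroup 0
        "b1": slope0,                   # slope for subgroup 0
        "b2": intercept1 - intercept0,  # intercept difference between subgroups
        "b3": slope1 - slope0,          # slope difference between subgroups
        "common": fit_line(test, criterion),  # (slope, intercept) of the common line
    }


# Hypothetical data: equal slopes, different intercepts.
test = [1, 2, 3, 4, 3, 4, 5, 6]
criterion = [3, 4, 5, 6, 6, 7, 8, 9]
group = [0, 0, 0, 0, 1, 1, 1, 1]
print(differential_prediction(test, criterion, group))
```

In the invented data the two subgroups share a slope but differ in intercept, so b3 is zero and b2 is not.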

Figure 6.7 illustrates the slope and intercept for the common regression line generated using
the total sample population, along with the regression lines for example subgroups 1 and 2. If the
Y-intercept of each subgroup regression line differs from the Y-intercept of the common regression
line, it would be concluded that the subgroups differed on the physical tests used as predictors
and/or on the criterion measure. In the physical testing arena, the regression line for subgroup 2 is
usually represented by women’s performance, with subgroup 1 representing the men. Studies that
investigated test fairness for physical tests have found both significant and non-significant
intercept differences for gender (Baker et al., 2000; Gebhardt, 1985, 1998b; Reilly et al., 1979).

Figure 6.7 Differential prediction analysis

The slopes of the subgroup regression lines are also tested for differences between each
subgroup and the common regression line. If significant differences in the slopes are found, this may
be attributed to variations in a subgroup’s physical test score and/or criterion measure score. This
finding is more problematic than intercept differences because the differential group validity
indicates that separate regression equations are appropriate for each group. However, the Civil Rights
Act of 1991 prohibits adjusting test scores based on subgroup differences.

Test  fairness  analysis  can  help  to  increase  test  utility  in  relation  to  adverse  impact.  Adverse 
impact  can  be  reduced  or  eliminated  by  lowering  a  passing  score.  However,  in  the  physical  domain, 
lowering  the  passing  score  to  a  point  that  eliminates  gender  adverse  impact  typically  diminishes  the 
utility  of  the  test.  Physical  tests  that  have  utility  usually  have  adverse  impact.  Although  adverse 
impact  is  undesirable,  the  Federal  statutes  indicate  that  it  is  acceptable  as  long  as  the  test  is  valid, 
job-related,  and  fair.  Use  of  the  test  fairness  analysis  satisfies  one  of  these  conditions. 


Summary 


In this chapter several methods for determining passing scores were discussed. Use of a specific
technique (i.e., expectancy tables, contingency tables, Taylor-Russell tables, normative data,
ergonomic data) depends upon the type of model used to validate the test (i.e., content,
criterion-related, construct). The need for accurate test (predictor), job performance (criterion), and
ergonomic data was emphasized. The integration of data from multiple sources, such as expectancy
and contingency tables, is important in establishing an accurate passing score that will reflect
future job performance. Finally, factors such as test fairness and adverse impact must be considered
when setting a passing score to ensure accurate employment decisions and compliance with the
EEOC Uniform Guidelines (1978).


Abstract


This chapter is an overview of the legal forces and issues related to employment practices. Title
VII of the Civil Rights Act of 1964, the Age Discrimination in Employment Act (ADEA) of 1967,
and the Americans with Disabilities Act (ADA) of 1990 are the federal laws that define discriminatory
employment practices. The centerpiece of employment discrimination law is Title VII of the Civil
Rights Act of 1964, as amended by Congress on several occasions. Title VII prohibits employment
discrimination because of “race, color, religion, sex, and national origin” by employers, labor
organizations, and employment agencies. Title VII tends to be comprehensive in that everyone is
potentially covered: both genders and all majority and minority racial and ethnic groups, as well
as religious groups, fall under it. The act does not, however, apply to Military personnel.

The disparate impact theory is used to establish employment discrimination. This legal process
has a three-part burden of proof. First, the plaintiff (employee) must establish that the hiring
practice has a disparate impact on a protected group. Although not legally mandated, the Equal
Employment Opportunity Commission (EEOC) Guidelines are often used to define disparate
impact. The guidelines use the four-fifths (4/5s) rule to define adverse impact. Under the 4/5s rule
a selection device has adverse impact when the pass rate for one protected group is less than
four-fifths, or 80 percent, of the pass rate of the group with the highest pass rate. Once adverse
impact is established, the burden of proof then falls on the defendant (employer) to justify that the
exclusionary effect is a business necessity. The defendant must show that the selection method is job
related. This involves demonstrating that the selection device (e.g., preemployment test) is valid. A
common method used to establish job relatedness is a validation study. Lastly, if business
necessity is established, the burden of proof shifts back to the plaintiff to demonstrate that the
employer failed to use a selection device that is equally effective but has a lesser disparate impact.

This chapter reviews cases related to physical testing. Many of these cases involve the use of
height and weight standards and tests for selecting public service employees such as police officers
and firefighters. The outcome of this litigation largely depends on the scientific quality of the
validation study. The recent court ruling of Lanning v. SEPTA (U.S. 3rd Circuit 1999) will likely have
a major impact on physical testing. An aerobic fitness cut-score representing a VO₂max of 42.5
ml/kg/min was found to be unacceptable by the court. An option offered by the court was the
validation of an aerobic fitness cutoff score that measures the minimum capacity necessary to perform
the job. This court ruling is consistent with established physiological and ergonomic principles of
selecting workers with the fitness demanded by the job. This ruling suggests that validation
studies will be evaluated not only by standard psychometric criteria but also by physiological
validation of the test and cut-score.


Employers have always used some method to select an employee from potential job applicants.
The rapid rise in the use of standardized tests for job placement can be traced to this country’s need
for rapid mobilization and use of human resources during the First and Second World Wars. The goal
was to match Military personnel to jobs on the basis of test performance. The development of
preemployment tests grew out of the discipline of psychology and its early success in measuring
differences among people. The common theme of this work was that persons differ from each other, in
reasonably stable ways, on some number of attributes, and that patterns of individual attributes are
more or less suited to particular patterns of job requirements (Dunnette & Hough, 1991).

Much of the early preemployment testing focused on cognitive abilities, but with the rise in
women seeking jobs that were once male dominated, the need for preemployment physical abilities
tests increased. The need for valid tests for selecting personnel for physically demanding jobs can
be traced to at least three important forces. First, equal employment opportunity legislation
resulted in greater numbers of women and handicapped persons seeking employment in occupations
requiring high levels of physical ability. Second, evidence suggested that physically unfit workers
had higher incidences of lower back injuries. Lastly, preemployment medical evaluations used alone
are inadequate for selecting personnel for physically demanding jobs. The disciplines most
prominent in physical ability employment testing are industrial-organizational (I/O) psychology,
industrial engineering, ergonomics, biomechanics, and exercise physiology.

This section provides an overview of employment law as it relates to defining job
discrimination in the civilian sector of the United States. It begins with a brief overview of the
Federal laws used to define job discrimination. This is followed by the legal process used to
establish discrimination. Next, cases relevant to physical testing and exercise physiology are
examined. Lastly and quite hazardously, possible future legal directions are explored.


Legal and Legislative Forces


The current interest in and research on preemployment test methodology for physically
demanding jobs has its roots not only in work physiology (Astrand & Rodahl, 1970; Durnin & Passmore,
1967; McArdle, Katch, & Katch, 1991; Passmore & Durnin, 1955) and psychometric test theory
(Division of Industrial-Organizational Psychology & Association, 1987), but also in Federal civil
rights legislation and court decisions on employment practices. Title VII of the Civil Rights Act of
1964, the Age Discrimination in Employment Act (ADEA) of 1967, and the Americans with
Disabilities Act (ADA) of 1990 are the Federal laws used for employment litigation. Although Title
VII and ADEA tend to be unambiguous, ADA has been found to be more difficult to interpret.

The centerpiece of employment discrimination law is Title VII of the Civil Rights Act of 1964,
as amended by Congress on several occasions. Title VII prohibits employment discrimination on
the basis of “race, color, religion, sex, and national origin” by employers, labor organizations, and
employment agencies. The term “sex” refers to “gender” and does not include sexual orientation
(Rothstein, Craver, Shroeder, & Shoben, 1999). Title VII tends to be comprehensive in that
everyone is potentially covered: both genders and all majority and minority racial and ethnic groups,
as well as religious groups, are covered by Title VII. The act does not apply to Military personnel
(Rothstein et al., 1999).

The Age Discrimination in Employment Act (ADEA) of 1967 provides the legal basis for
defining job discrimination on the basis of age. The substantive provisions of the ADEA are identical
to Title VII with the substitution of the word “age” as the prohibited basis for discrimination in place
of “race, color, religion, sex and national origin” found in Title VII (Rothstein et al., 1999, p. 215).

The most recent law used to define discrimination is the Americans with Disabilities Act (ADA)
of 1990. ADA is a comprehensive federal law that prohibits discrimination in a wide variety of
segments of life. The law has five titles. Title I covers employment of Americans with physical and
mental disabilities (Rothstein et al., 1999). The law defines a person with disabilities as someone with
a substantial impairment that significantly limits or restricts a major life activity such as hearing,
seeing, speaking, walking, breathing, performing manual tasks, caring for oneself, learning, or working.

According to section 101(8) of ADA, a disabled worker is a person with a disability who, with
or without reasonable accommodation, can perform the essential functions of the employment
position (Rothstein et al., 1999). Employers have a duty to make reasonable accommodations to
the known physical or mental disability. Some examples are making facilities accessible, job
restructuring, and acquisition or modification of equipment or devices. Reasonable accommodation is
not required if it results in undue hardship to the employer, defined as “an action requiring significant
difficulty or expense in light of factors such as the nature and cost of the accommodation and the
size and financial resources of the company” (Rothstein et al., 1999, p. 246). The exact number of
Americans covered under this law is not known, but it has been estimated to exceed 43 million.

A key issue with the ADA is that the person must be legally disabled. What substantially
limits a major life activity, the legal definition of a disability, is currently being defined in the
courts. In 1999 the Supreme Court ruled in two cases that having a condition that is correctable is
not considered a disability. In the case of Sutton v. United Airlines, Inc. [___ U.S. ___, 119 S.Ct.
2139, 144 L.Ed.2d 450 (1999)], the Court ruled that the effect of eyeglasses on vision-impaired
plaintiffs should be considered when defining a disability under the ADA. In the companion case of
Murphy v. United Parcel Service, Inc., the Court ruled that in evaluating the severity of the
plaintiff’s hypertension, the effect of his medication should be considered [___ U.S. ___, 119 S.Ct.
2133, 144 L.Ed.2d 450 (1999)]. These two rulings showed that the burden of documenting a disability
clearly rests with the employee and that an employee who fails to control a controllable disability
may lose his or her protection under ADA (Rothstein et al., 1999).

Under ADA, preemployment medical examinations and medical inquiries are illegal. Although
preemployment medical examinations may not be given, a post-offer medical examination is
permitted, and a conditional job offer may be withdrawn if the examination documents that the
applicant is unable to perform the essential functions of the job. Medical qualification for a
job is a two-step process. First, the physical and mental demands of the job must be documented.
Second, the medical examination must evaluate the applicant's capacity to perform the essential
job functions. In addition, medical examinations can be used to determine if a person is
physically able to return to work after a disability leave (Rothstein et al., 1999).

In  contrast  to  a  general  medical  examination,  a  preemployment  skill  or  physical  ability  test  can 
be  legally  used  for  employee  selection.  Under  ADA  such  a  test  is  not  considered  a  medical  exami¬ 
nation.  However,  if  the  preemployment  test  screens  out  applicants  on  the  basis  of  disability  defined 
under  ADA,  the  employer  has  the  burden  of  proving  that  the  test  is  job  related  and  consistent  with 
business  necessity  (Rothstein  et  al.,  1999). 


Legal  Process— Discrimination  Litigation 


The  disparate  impact  theory  is  used  to  establish  discrimination  under  Title  VII,  ADEA  and 
ADA.  This  legal  process  has  a  three-part  burden  of  proof — 

1.  The  plaintiff  (employee)  must  establish  a  disparate  impact  on  a  protected  group. 

2.  If disparate impact on a protected group is established, the defendant (employer) must
then justify the exclusionary effect with a business necessity. The defendant must show
that the selection method is job related.

3.  If  business  necessity  is  established,  the  burden  of  proof  shifts  back  to  the  plaintiff  to 
demonstrate  that  the  employer  failed  to  use  a  selection  device  that  is  equally  effective  but 
has  a  lesser  disparate  impact. 
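
The ordering of the three burdens above can be sketched as a short decision function. This is only an illustration of the sequence described in the list; the function name `disparate_impact_outcome` and its three yes/no parameters are hypothetical simplifications, since actual litigation turns on the weight of evidence rather than on three boolean answers.

```python
def disparate_impact_outcome(impact_shown, necessity_shown, alternative_shown):
    """Walk the three burdens in order and report which party prevails.

    All three arguments are hypothetical yes/no simplifications:
      impact_shown      - plaintiff established disparate impact (step 1)
      necessity_shown   - defendant established business necessity (step 2)
      alternative_shown - plaintiff showed an equally effective selection
                          device with a lesser disparate impact (step 3)
    """
    if not impact_shown:
        return "defendant"  # step 1 not met: no disparate impact established
    if not necessity_shown:
        return "plaintiff"  # step 2 not met: exclusionary effect not justified
    if alternative_shown:
        return "plaintiff"  # step 3 met: a less discriminatory alternative existed
    return "defendant"      # test is job related and no better alternative shown


print(disparate_impact_outcome(True, True, False))
```

Each later burden is only reached when the earlier one has been met, which is why the checks are written as early returns rather than as one combined condition.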


Disparate  Impact 

In  order  to  find  an  employment  selection  method  discriminatory,  the  plaintiff  (i.e.,  the  job 
applicant  or  affected  employee)  must  establish  that  the  method  has  disparate  impact  on  a  protect¬ 
ed  group.  The  plaintiff  must  show  that  the  employment  selection  method  adversely  affects  the 
employment opportunities based on race, color, religion, sex, national origin, age, or a
qualified disability. The Supreme Court case Griggs v. Duke Power Co. (401 U.S. 424, 91 S.Ct.
849, 28) was the first case to use a disparate impact theory of discrimination. The power
company used standardized aptitude tests for assigning jobs. The plaintiff class showed that the
aptitude tests adversely affected racial groups. While 58 percent of whites passed the test,
only 6 percent of African-Americans passed. The courts ruled that under Title VII, employment
tests with disparate impact could not be used unless they were job related (Rothstein et al.,
1999).

Although disparate impact or adverse impact must be proved to establish discrimination, the
Federal laws do not define explicitly what constitutes it. In 1966 the Equal Employment
Opportunity Commission (EEOC) published the first set of guidelines on employment testing, which
were revised in 1970. This led in 1978 to the publication of the Uniform Guidelines on Employee
Selection Procedures (EEOC, 1991). These Federal standards and rules were jointly agreed on by
the EEOC, Civil Service Commission, and Departments of Labor and Justice. The EEOC Guidelines
use the four-fifths (4/5s) rule to define adverse impact. This tends to be a rule of thumb used
by the EEOC and Federal enforcers of employment law. Under the 4/5s rule a selection device has
adverse impact when the pass rate for one protected group is less than four-fifths, or 80
percent, of the pass rate of the group with the highest pass rate. To illustrate, assume the pass
rate for the highest group is 60 percent and the pass rate of a protected group is 30 percent. In
this example, the pass rate of the protected group is 50 percent (30% ÷ 60% = 50%) of the highest
group. Under the EEOC Guidelines this would constitute disparate impact because it is below the
80 percent standard. If the pass rate of the protected group was 41 percent, while the pass rate
of the highest group was 50 percent, the selection device would not have adverse impact. The pass
rate of the protected group would be 82 percent (41% ÷ 50% = 82%), above the 80 percent standard
defined in the EEOC Guidelines. Rothstein (1999) points out that trial courts need not adhere
to the 4/5s rule, but legal history shows that the 4/5s rule is viewed favorably by the courts.
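
The 4/5s arithmetic above can be checked with a few lines of Python. This is a minimal sketch rather than an official EEOC procedure; the function names `impact_ratio` and `has_adverse_impact` are hypothetical. Exact fractions are used so that a ratio of exactly 80 percent is not misclassified by floating-point rounding.

```python
from fractions import Fraction


def impact_ratio(group_pass_rate, highest_pass_rate):
    """Pass rate of a group as a fraction of the highest group's pass rate."""
    highest = Fraction(str(highest_pass_rate))
    if highest <= 0:
        raise ValueError("highest_pass_rate must be positive")
    return Fraction(str(group_pass_rate)) / highest


def has_adverse_impact(group_pass_rate, highest_pass_rate):
    """True when the ratio falls below four-fifths (80 percent)."""
    return impact_ratio(group_pass_rate, highest_pass_rate) < Fraction(4, 5)


# The two worked examples from the text (pass rates given in percent).
print(float(impact_ratio(30, 60)), has_adverse_impact(30, 60))  # 0.5 True
print(float(impact_ratio(41, 50)), has_adverse_impact(41, 50))  # 0.82 False
```

Because the rule compares a ratio of pass rates, the inputs may be given as percentages or as proportions, provided both use the same scale.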

When physical tests are used for employment decisions, sex tends to be a source of adverse
impact (Hogan & Quigley, 1986; Hogan, 1991). This potential for adverse impact can be traced to
the well-documented male and female differences in strength (Baumgartner & Jackson, 1999;
Golding, Meyers, & Sinning, 1989; Hoffman, Stouffer, & Jackson, 1979; Laubach, 1976; NIOSH,
1977), maximal oxygen uptake (VO2max) (Astrand & Rodahl, 1970; Golding, Meyers, & Sinning,
1989; Jackson, Beard, Wier, Ross, Stuteville, & Blair et al., 1995; Jackson, Wier, Ayers, Beard,
Ross, Stuteville, & Blair et al., 1996; Vogel, Patton, Mello, & Daniels, 1986), and percent body
fat (Jackson & Pollock, 1978; Jackson, Pollock, & Ward, 1980; McArdle, Katch, & Katch, 1991;
Vogel et al., 1986; Wilmore & Costill, 1994).

Although  much  of  the  reason  for  adverse  impact  in  physical  testing  can  be  traced  to  physiologi¬ 
cal  differences  between  men  and  women,  another  factor  is  the  physical  demands  of  the  job.  The  more 
physically  demanding  the  job,  the  more  likely  a  test  will  fail  the  EEOC  Guidelines  4/5s  rule.  Table 
7.1  illustrates  this  with  lifting  data  obtained  on  608  women  and  men  in  the  Human  Factors  lab  at  the 
University  of  Houston.  The  lift  task  was  the  common  floor-to-knuckle  height  lift  that  became  pro¬ 
gressively  heavier  (Jackson  &  Sekula,  1999).  The  goal  was  to  continue  to  lift  heavier  loads  until  the 
weight  became  too  heavy  to  lift.  Table  7.1  gives  the  percentages  of  men  and  women  who  could  lift 
loads between 15 and 45 kg (33 to 99 pounds). The data in Table 7.1 illustrate that there would
not be adverse impact based on the EEOC Guidelines 4/5s rule for lift loads of < 25 kg, but there
would be adverse impact for lift loads > 30 kg. Further data in Table 7.1 show that as the
physical demands of the task increase, the more likely it is that the task would have legally
disparate impact.

to promote workers to more skilled positions and ruled that the validation studies were inadequate
in several respects under the EEOC Guidelines (Rothstein et al., 1999).

Although the law does not require an employer to follow the procedures outlined in the EEOC
Guidelines to establish job relatedness, the failure to do so encourages litigation. The EEOC
Guidelines list three acceptable types of validity studies. These are as follows—

1.  Criterion-related  validity  is  established  with  empirical  evidence  (e.g.,  correlation  coeffi¬ 
cients)  linking  a  test  and  job.  The  goal  is  to  demonstrate  that  the  test  predicts  important 
elements  of  work  behavior. 

2.  Content  validity  is  established  by  replicating  major  portions  of  the  job.  The  goal  is  to 
develop  a  test  that  is  a  representative  sample  of  the  behaviors  of  the  job. 

3.  Construct validity demonstrates that a test measures identifiable traits or characteristics
important for successful job performance. The goal is to show that the test accurately
measures the construct and that the construct is necessary for successful job performance.

The EEOC Guidelines require that not only the test but also the cutoff score be validated. To
illustrate, Harless v. Duck was a class action that challenged the physical ability test used by
the Toledo, Ohio, Police Department. Applicants were required to complete three of the following
four tests: 15 push-ups; 25 sit-ups; 6-foot standing broad jump; and a 25-second obstacle
course. The Sixth Circuit endorsed the need for fitness but concluded that there is no
justification in the record for the type of exercises chosen or the passing marks for each
exercise (Rothstein et al., 1999). In a 1999 case (Lanning v. SEPTA) argued at the U.S. Court of
Appeals (3rd Circuit), a VO2max cut-score of 42.5 ml/kg/min was found to be discriminatory under
a disparate impact theory of liability for selecting police officers for a regional mass transit
authority. Although evidence was presented showing that VO2max was significantly correlated with
job performance measured by arrest records, the Court ruled that the cut-score was
discriminatory under the Civil Rights Act of 1991 because the validation study did not establish
that this was the minimum qualification necessary for successful performance of the job.

Alternative  Selection 

If the plaintiff establishes that a test or selection method has disparate impact, but the
defendant proves the validity of the method, the plaintiff can still prove discrimination under
the disparate impact theory by demonstrating that there was a less discriminatory alternative.
The less discriminatory alternative was established in the Albemarle Paper Co. v. Moody case
(422 U.S. at 405, 425, 95 S.Ct. At). The Court’s explanation was as follows—

If an employer does then meet the burden of proving that its tests are “job related,” it remains
open to the complaining party to show that other tests or selection devices, without a similarly
undesirable racial effect, would also serve the employer’s legitimate interest in “efficient and
trustworthy workmanship.” Such a showing would be evidence that the employer was using its tests
merely as a “pretext” for discrimination.

Less discriminatory alternatives not only pertain to the testing device, but also to setting the
cut-score. Although the Albemarle Paper Co. v. Moody case clearly established the burden of
proof for the alternate selection prong of disparate impact litigation, Rothstein et al. (1999)
report that this prong is legally unclear and that case law has not provided much clarification.
The Sixth Circuit Court considered the concept in Brunet v. City of Columbus (58 F.3d 251, 6th
Cir. 1995). The court rejected the plaintiff's claim that the city should have used a different
cut-score for a firefighter test to have less adverse impact on female firefighters. The finding
of fact in the case was that “there was not a substantially equally valid cutoff score with a
lesser adverse impact, and it was not required to consider all possible alternative hiring
procedures...” (Rothstein, 1999, p. 174). Similarly, in Smith v. City of Des Moines (99 F.3d
1466, 8th Cir. 1996), the Court rejected the age discrimination claim and ruled that “the
plaintiff failed to demonstrate that the proposed alternative would actually have a lesser
impact nor that it would serve the city’s legitimate interests equally well” (Rothstein et al.,
1999, p. 174).


Arvey and Faley (1988) maintain that the landmark case of Myart v. Motorola in 1963 was the
first signal that the court system had become involved in the employment process. Leon Myart, an
African-American with previous job-related experience, was refused a job in a Motorola plant
because his score on a five-minute intelligence test was too low. Myart filed a complaint with
the Illinois Fair Employment Practices Commission and charged racial discrimination. The
Illinois Commission ruled that Myart be offered a job and ruled that the test could no longer be
used for selection decisions. This landmark case motivated employers to develop preemployment
tests that did not discriminate against a protected group.

Although the focus of this document is on preemployment testing, it is important to realize that
most employment legal cases are for dismissals and failures to be promoted, not failure to hire.
Rothstein and associates (Rothstein et al., 1999) point out that those who have been “wronged”
on the job are more likely to initiate litigation than those who have failed to be hired. They
further explain that the legal system is less sympathetic to those who have failed to be hired
than those who lose their jobs.

Hogan and Quigley (1986) provide an excellent review of court cases related to physical testing
and physical standards. The cases reviewed include the use of height and weight standards and
physical ability testing. Of the 44 cases reviewed, 34 involved height and weight standards, and
10 involved physical ability tests for employee selection. Of these cases, most (37) involved
law enforcement and firefighter employee-selection procedures. This section provides cases
related to the following—

1.  The  use  of  height  and  weight 

2.  The  use  of  physical  tests  for  selecting  firefighters 

3.  Cases  related  to  lifting  and  materials  handling 

4.  Cases related to the use of physiological parameters to define cut-scores.

Chapter 7: Legal issues


Height  and  Weight 


In  the  1960s,  height  and  weight  standards  were  a  condition  of  employment  for  many  public 
safety  jobs,  and  these  standards  clearly  had  an  adverse  impact  on  women.  Arvey  and  Faley  (1988) 
reported  that  in  1973,  nearly  all  of  the  nation’s  large  police  departments  had  a  minimum  height 
requirement.  The  average  requirement  was  68  inches.  More  than  90  percent  of  the  women  and  45 
percent  of  the  men  would  be  expected  to  fail  the  68-inch  height  requirement.  The  rationale  for  the 
standard was that size was related to physical strength, and the effectiveness of a police
officer’s job performance depended on strength.

An  important  case  on  the  use  of  a  height  and  weight  requirement  was  decided  in  June  1977  by 
the  U.S.  Supreme  Court.  In  Dothard  v.  Rawlinson,  a  woman  was  refused  employment  as  a  correc¬ 
tional-counselor  trainee  because  she  did  not  meet  the  minimum  height  and  weight  requirements 
of  62  inches  and  120  pounds.  The  standard  was  found  to  have  adverse  impact  because  it  excluded 
33.3 percent of the women and only 1.3 percent of the men. The pass rate of females was only 67
percent of that of males, thereby failing the 4/5s rule and establishing adverse impact. Once
adverse impact was established, the defendants argued that the height and weight requirements
were job related because they have a relationship to strength, which is job related. The Supreme
Court ruled that if strength is a real job requirement, then a direct measure of strength should
have been adopted. The defendants failed to prove that the height and weight requirement was a
business necessity.
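
The adverse impact figure in Dothard v. Rawlinson is stated in terms of exclusion rates, so the pass rates have to be derived before the 4/5s rule can be applied. The short Python sketch below shows that conversion; the helper name `pass_rate` is hypothetical, and the only inputs are the two exclusion percentages quoted above. The computed ratio comes out at roughly two-thirds, in line with the approximate figure given in the text.

```python
def pass_rate(excluded_percent):
    """Convert a percentage excluded by a standard into a percentage passing."""
    return 100.0 - excluded_percent


women_pass = pass_rate(33.3)   # percent of women meeting the standard
men_pass = pass_rate(1.3)      # percent of men meeting the standard
ratio = women_pass / men_pass  # women's pass rate relative to men's

print(f"impact ratio: {ratio:.3f}")
print("below four-fifths threshold:", ratio < 0.8)
```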

The  following  cases  provide  instances  in  which  height  or  weight  requirements  have  been  sup¬ 
ported  in  the  courts.  In  Boyd  v.  Ozark  Airlines,  the  Court  ruled  in  favor  of  a  height  requirement. 
The  height  requirement  had  disparate  impact  against  women,  but  the  Court  ruled  that  a  minimum 
height  is  necessary  for  a  pilot  to  see  properly  and  reach  all  the  controls  in  an  airplane  cockpit 
(Hogan & Quigley, 1986). In Costa (Le Boeuf) v. Markey (DC, VA 1977) the Court ruled that
imposing a 5’6” height requirement on a list of exclusively female applicants did not produce a
disparate effect on women. The ruling of the Court in EEOC v. Delta Air Lines (DC, TX 1980) was
that requiring different height and weight standards for male and female airline flight
attendants did not constitute a sex discrimination violation of Title VII because weight is
generally subject to one’s own control. However, in another airlines case the Court ruled that
it was sex discrimination to require female airline flight attendants to maintain certain weight
levels and to have their weight monitored when similar requirements were not placed on male
flight attendants (Gerdom v. Continental Airlines, 9th Cir. 1982). In Meadows v. Ford Motor Co.
(DC, KY 1973) the use of a minimum weight requirement for employees was found to be unlawful
discrimination on the basis of sex because the weight requirement was not shown to be related to
job qualifications.

Physical  Tests 

Of  the  10  cases  involving  physical  tests  reviewed  by  Hogan  and  Quigley  (1986),  all  involved 
police and firefighter preemployment tests. One of the cases (Hull v. Cason) was for race
discrimination while the remaining charged sex discrimination under Title VII. The common test
development approach that emerged from these cases was the use of general physical ability and
fitness tests such as sit-ups, push-ups, pull-ups, squat thrusts, and various strength tests.
Arvey and Faley (1988) maintain that these tests are less likely to be legally supported because
they do not represent “samples” of actual work behavior. This was especially evident in the
classic 1982 New York City firefighter case, Berkman v. City of New York. The physical agility
test items were selected using the constructs defined by Fleishman (Fleishman, 1964). None of
the women tested passed the New York City firefighter test while 46 percent of the men did. The
Court stated, “Nothing in the concepts of dynamic strength, gross body equilibrium, stamina, and
the like, has such a grounding in observable behavior of the way firefighters operate that one
could say with confidence that a person who possesses a high degree of these abilities as
opposed to others will perform well on the job” (Arvey & Faley, 1988, p. 279).

Listed below are the decisions of the nine sex discrimination cases reviewed by Hogan and
Quigley (1986). The common denominator of these cases was that each used common fitness items,
in some instances in combination with work simulation tests such as a dummy drag. The ruling of
only one case (Hardy v. Stumpf) supported the defendant’s use of the test. Following are the
cases and the legal decisions—

1.  Hall v. White — The physical agility tests (i.e., squat thrusts, sit-ups, push-ups, squat
jumps, and pull-ups) were not found to be job related.

2.  Officers  for  Justice  v.  Civil  Service  Commission —  The  physical  agility  tests,  which  were 
primarily  upper  body  strength  tests,  did  not  predict  job  performance. 

3.  Hardy  v.  Stumpf —  The  tests  were  found  reasonable,  were  supported  by  job  analysis,  and 
were  not  in  violation  of  Title  VII. 

4.  United  States  v.  City  of  Buffalo  —  The  tests  used  a  weighted  sum  for  height  and  weight 
and  agility  score.  This  method  gave  an  advantage  to  taller  persons,  and  the  defendants 
were  enjoined  from  further  use  of  the  method. 

5.  Blake  v.  City  of  Los  Angeles  —  Job  relatedness  was  not  established  for  tests  that  combined 
running  with  job-related  tests  (e.g.,  scale  6-foot  wall  and  drag  140-pound  dead  weight), 
and  the  validation  studies  were  flawed. 

6.  United States v. Philadelphia — The tests (0.5 mile shuttle run, obstacle course, jump
reaction time, and grip strength) did not show job relatedness.

7.  United States v. New York — The tests were work simulations (e.g., shotgun aiming, tire
change, etc.), but the job analysis was inappropriate for content validation and a different
scoring strategy could have reduced adverse impact.

8.  Harless  v.  Duck — Tests  (i.e.,  push-ups,  sit-ups,  and  standing  broad  jump)  were  not  proved 
valid  or  job  related,  and  job  analysis  did  not  specify  amount  of  strength  exertion  required. 

9.  Berkman v. City of New York — The validation strategy was inappropriate, and should
have used construct or criterion-related validity. The physical tests were dummy carry, grip
strength, long jump, flexed arm hang, agility test, ledge walk, and mile run.


Lifting


Some employers have attempted to argue that gender is a bona fide occupational qualification
(BFOQ) for lifting and materials handling tasks. The defense for BFOQ allows for intentional
classification of applicants or employees in the narrow circumstances in which such a
classification is judged as a reasonable business necessity. If gender is a valid BFOQ, an
employer can lawfully refuse to hire a person on the basis of his or her gender (Rothstein et
al., 1999). The EEOC Guidelines interpret this defense very narrowly. Some examples of BFOQ
based on gender are rest-room attendant and acting parts for male and female roles (Hogan &
Quigley, 1986; Rothstein et al., 1999).

In response to the deleterious working conditions encountered by women during the industrial
revolution, many states passed laws that “protected” women from physical labor. The courts have
ruled that gender is not a BFOQ for lifting and materials handling physical tasks. The finding
of the Supreme Court case, Dothard v. Rawlinson [435 U.S. 702, 98 S.Ct. 1370, 55 L.Ed.2d 657
(1978)], was that an employer cannot reject a female applicant who is capable of performing the
job requirements solely because many other members of her gender group cannot do so (Hogan &
Quigley, 1986; Rothstein et al., 1999). In the 1969 case, Weeks v. Southern Bell Telephone &
Telegraph Company, the company did not consider a female employee’s bid for a job vacancy as
switchman because the job had a 30-pound lifting requirement. Although Southern Bell contended
that the switchman’s job was “strenuous,” the Court ruled that Southern Bell did not show that
the job was so strenuous that most women could not perform it and that the lifting requirement
was based on a “stereotyped characterization.” Southern Bell lost the case because they did not
establish the validity of the lifting requirement (Arvey & Faley, 1988; Hogan & Quigley, 1986).
The EEOC ruled (EEOC Decision No. 71-1868, April 22, 1971) that an employer’s use of only a
few women as a small sampling of all women to perform lifting work was insufficient to establish
that women were not qualified for jobs.

The courts have ruled that it is unlawful to disqualify women from being assigned to a job if an
alternative exists with respect to the heavy lifting. In the case, McLean v. State of Alaska
(Alas. 1978, S.Ct. 18 EPD § 8787, 583 P.2d 867), the task involved carrying 100-pound bundles of
laundry. The Court ruled that an alternative was available by making up laundry in bundles of
less than 100 pounds. The position of the Court was that designating this job as a male job was
unlawful. This position is consistent with the sound ergonomic practice of engineering excessive
physical demands out of the job by redesigning the job (Waters, Putz-Anderson, Garg, & Fine,
1993). A problem, however, is that it often is not easy or even possible to redesign the job.

Physiological  Parameters 

Strength,  aerobic  capacity,  and  body  composition  are  physiological  parameters  that  have  been 
used  for  making  employment  decisions.  This  section  briefly  summarizes  relevant  cases.  In  addition 
to  cases  that  were  resolved  by  the  courts,  two  additional  EEOC  cases  that  did  not  result  in  a  final 
legal opinion are discussed.

Submaximal VO2max Test — In a 1992 case involving a preemployment test for entry-level mill
workers in a company's logging and sawmill operations, physiological test principles helped
decide the case. The preemployment test used by the timber company was ruled illegal because the
test disproportionately excluded women qualified for the jobs. The test had three items: a board
pull ergometer to measure strength (pulling 30-, 50-, and 70-pound weights for specified
durations); a 6-minute step test using an 11-inch bench; and a visual assessment of the
applicant's gross body coordination. The legal problem was with the step test.

The step test required the applicant to wear a heart rate monitor. The test required that the
subject exercise at an intensity of 10 metabolic equivalents (METs). The cut-score for the test
was the physiological capacity to exercise at a 10-MET power output at < 85 percent of the
applicant's maximum aerobic capacity, as estimated from heart rate. Since the female passing
rate on the test was 42.4 percent of the male passing rate, the test showed disparate impact
under the Uniform Guidelines 4/5s rule. The judge stated,

Simpson has met its burden of showing that the test is job related and serves, in a significant
way, the company's legitimate employment goal of hiring a physically fit workforce, in that
those who pass the test, as a group, are likelier to be able to do the jobs adequately and
safely than are those, as a group, who do not pass the test. The test nonetheless unnecessarily
excludes qualified applicants, a disproportionate number of whom are women.

The judge ruled that the methods used to administer a step test introduced gender bias, and the
step test cut-score of 10 METs should be reduced to 8.5 METs, the cut-score used in testing
existing employees who sought a transfer from one division of the company to another. The judge
further ruled that part of the adverse impact was due to the failure to use an adjustable height
bench or different height benches for men and women, and that this would not change the
selection device except to increase accuracy. Lastly, part of the adverse impact was attributed
to poor test administration. The judge stated that in some instances, women applicants were not
given timely
instructions  about  what  they  should  eat  or  drink  before  the  test;  men  and  women  were  required  to 
wait  together  while  the  test  was  administered;  and  female  applicants  were  treated  in  ways  that 
caused  tension  and  anxiety,  which  could  affect  the  outcome  of  the  step  test. 

The  use  of  heart  rate-scored  employment  tests  should  be  viewed  with  caution.  A  major  prob¬ 
lem  with  heart  rate-monitored  tests  is  that  many  applicants  take  medically  prescribed  medications 
that  affect  heart  rate.  For  example,  beta-blocker  medication  is  commonly  prescribed  for  hyperten¬ 
sion.  Although  the  medication  is  effective  in  lowering  blood  pressure,  it  also  lowers  resting,  exer¬ 
cise,  and  maximum  heart  rate.  The  person's  drug-affected  maximum  heart  rate  would  need  to  be 
known  to  accurately  determine  exercise  heart  rate  at  a  given  percentage  of  maximum  aerobic  capac¬ 
ity.  This  would  not  likely  be  known. 

Aerobic Capacity: The recent court ruling of Lanning v. SEPTA (U.S. 3rd Circuit, 1999) is likely
to have a major impact on the use of physiological variables to establish employment cut-scores.
SEPTA is a regional mass transit authority that operates principally in Philadelphia, Pennsylvania.
A job analysis showed that SEPTA police officers had to chase suspects, and this often involved
running up stairs. Subject matter experts (SMEs) were interviewed by an exercise physiologist to
determine the level of physical exertion necessary to perform tasks. The SMEs reported that a
reasonable level of physical exertion was to run 1 mile in full gear in 11.78 minutes. The exercise
physiology expert judged this as too low (i.e., 11.78 minutes per mile) and recommended the 1.5-mile
run test with a more demanding cut-score of 12 minutes (i.e., 8.00 minutes per mile). This cut-score
represented a VO2max of 42.5 ml/kg/min, which was the same level recommended for selecting
firefighters (Davis, 1992). The test was administered to incumbents, and the pass rates were 6.7
percent for women and 55.6 percent for men. The pass rate of women was only 12.1 percent of the
men’s, substantially lower than the 80 percent required by the 4/5s rule of the EEOC Guidelines.
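
The 4/5s comparison described above is simple arithmetic, and a minimal sketch may make it easier to check. The function names below are illustrative only and do not come from the EEOC Guidelines; the pass rates and run times are the ones reported in the text.

```python
def adverse_impact_ratio(focal_pass_rate, reference_pass_rate):
    """Pass rate of the focal group divided by that of the reference group."""
    if reference_pass_rate <= 0:
        raise ValueError("reference pass rate must be positive")
    return focal_pass_rate / reference_pass_rate


def meets_four_fifths_rule(focal_pass_rate, reference_pass_rate, threshold=0.80):
    """True when the focal group's pass rate is at least 80 percent of the reference group's."""
    return adverse_impact_ratio(focal_pass_rate, reference_pass_rate) >= threshold


# Pass rates (percent) reported for the SEPTA incumbents.
women_pass, men_pass = 6.7, 55.6
ratio = adverse_impact_ratio(women_pass, men_pass)
print(f"women/men pass-rate ratio: {ratio:.1%}")
print("meets 4/5s rule:", meets_four_fifths_rule(women_pass, men_pass))

# Pace implied by the recommended cut-score: 1.5 miles in 12 minutes.
pace_min_per_mile = 12 / 1.5
print(f"pace: {pace_min_per_mile:.2f} minutes per mile")
```

The ratio falls far below 0.80, which is the disparity the plaintiffs relied on.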

Expert witness evidence for the defendant demonstrated that aerobic fitness was related to, and an
important ability required by, the transit police officer’s job. Evidence was presented to show that
the aerobic capacity of more than 52 percent of the persons arrested was 48 ml/kg/min, and only
27 percent of those arrested had an aerobic capacity of less than 42 ml/kg/min. Additional evidence
was presented that showed a statistically significant correlation between aerobic capacity and
arrests, arrest rates, and service-related awards. Although these data indicated that aerobic capacity
was an essential determinant of job performance, the court ruled that a discriminatory cutoff
score of the capacity to run 1.5 miles in 12 minutes (VO2max = 42.5 ml/kg/min) is impermissible
unless it represented the minimum qualification necessary for successful performance of the job in
question. The legal foundation for this ruling was the Supreme Court interpretation from Griggs
v. Duke Power Co. on the business necessity doctrine. The Court’s interpretation was that a
discriminatory cutoff score must be validated to show that it measures the minimum qualifications
necessary for successful performance of the job. The Court further ruled that this was consistent
with EEOC Guidelines that the cut-score “be set so as to be reasonable and consistent with normal
expectations of acceptable proficiency within the work force.” The Court went on to
declare that this is the only way to be certain to eliminate the use of excessive cutoff scores that
have a disparate impact on minorities.

A  stated  goal  of  SEPTA  was  to  respond  to  a  perceived  need  to  upgrade  the  quality  of  the  police 
force.  The  Court  indicated  that  there  were  three  options  open  to  help  SEPTA  achieve  its  stated 
goal  of  increasing  the  aerobic  capacity  of  its  police  officers  and  to  be  consistent  with  Title  VII.  The 
options listed in the court ruling were as follows:

1. Abandon the test as a hiring requirement but maintain an incentive program to encourage
an increase in the officers’ aerobic capacities

2.  Validate  a  cutoff  score  for  aerobic  capacity  that  measures  the  minimum  capacity  necessary 
to  successfully  perform  the  job  and  maintain  incentive  programs  to  achieve  even  higher 
aerobic  levels 

3. Institute a nondiscriminatory test for excessive levels of aerobic capacity, such as a test that
would exclude 80 percent of men as well as 80 percent of women through a separate aerobic
capacity cutoff for the different sexes.

Defining the aerobic intensity of work tasks is well founded in the discipline of exercise physiology
and consistent with the Lanning v. SEPTA court ruling. Energy cost tables for common
work tasks and recreational activities are published in several sources (Astrand & Rodahl, 1970;
Durnin & Rahaman, 1967; McArdle et al., 1991; Passmore & Durnin, 1955). These estimates are
expressed in kilocalories per minute, oxygen consumption, or metabolic equivalents (METs).

A current, important research focus is to define the energy cost needed to fight fires. This
research effort can be attributed to the amount of litigation leveled at the validity of firefighter
preemployment tests and the use of age to terminate employment. Several investigators (Barnard &
Duncan, 1975; Davis & Dotson, 1978; Lemon & Hermiston, 1977; Manning & Griggs, 1983;
O’Connell, Thomas, Caddy, & Karwasky, 1986; Sothmann, Saupe, Jasenor, & Blaney, 1992) published
data showing that fire suppression work tasks have a substantial aerobic component. In an
important study, Sothmann and a team of researchers (Sothmann et al., 1990) provide strong evidence
that the minimum VO2max required to meet the demands of fire fighting is 33.5 ml/kg/min.
The authors used a work sample test involving seven job-related firefighter tasks. The sensitivity
(percentage of correctly classified unsuccessful performers) and specificity (percentage of correctly
classified successful performers) for a VO2max cut-score of 33.5 ml/kg/min were 67 percent and 83
percent, respectively. Lowering the cut-score to 30.5 ml/kg/min dropped the sensitivity to 25 percent
and increased the specificity to 95 percent.
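
The trade-off between sensitivity and specificity can be illustrated with a short sketch that uses the definitions given above. The ten VO2max values and work-sample outcomes below are invented for illustration and are not the Sothmann et al. data; the function name is likewise hypothetical.

```python
def sensitivity_specificity(vo2max_values, succeeded, cut_score):
    """Return (sensitivity, specificity) for a VO2max cut-score.

    Following the definitions in the text:
    sensitivity = share of unsuccessful performers who score below the cut-score,
    specificity = share of successful performers who score at or above it.
    """
    pairs = list(zip(vo2max_values, succeeded))
    unsuccessful = [v for v, ok in pairs if not ok]
    successful = [v for v, ok in pairs if ok]
    sensitivity = sum(v < cut_score for v in unsuccessful) / len(unsuccessful)
    specificity = sum(v >= cut_score for v in successful) / len(successful)
    return sensitivity, specificity


# Hypothetical sample: VO2max (ml/kg/min) and whether the work-sample test was completed.
vo2max = [28.0, 31.0, 32.0, 35.0, 34.0, 36.0, 38.0, 40.0, 31.5, 45.0]
passed_work_sample = [False, False, False, False, True, True, True, True, True, True]

for cut in (33.5, 30.5):
    sens, spec = sensitivity_specificity(vo2max, passed_work_sample, cut)
    print(f"cut-score {cut}: sensitivity {sens:.0%}, specificity {spec:.0%}")
```

In this invented sample, as in the published result, lowering the cut-score screens out fewer unsuccessful performers while wrongly rejecting fewer successful ones.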

The VO2max cut-score of 33.5 ml/kg/min was used to help decide the Smith v. City of Des
Moines, Iowa (U.S. District Court Southern District of Iowa-Central Division, 1995) firefighter
case. The City of Des Moines requires that all firefighters be certified to wear a respirator
(SCUBA). The city’s pulmonologist, in cooperation with fire department personnel, developed a
testing program on the appropriate level of cardiopulmonary fitness necessary for SCUBA certification.
The standard consisted of two parts: First, the standard was met if the firefighter’s FEV1
was > 70 percent. If the FEV1 was below 70 percent, the firefighter would be required to take a
maximum exercise test with a minimum cut-score of 33.5 ml/kg/min.

The firefighter alleged employment discrimination against the city of Des Moines because of
his age. The Court ruled that the duties of a firefighter are inherently dangerous and that the fire
department requires its firefighters to have the level of fitness needed to respond immediately and
effectively to emergencies. Further, the Court held that the standards were reasonable and based on
the demands of firefighting. The plaintiff claimed that the VO2max standard violates the age
discrimination prohibition because the likelihood of failure increases with age. The Court ruled that
the 33.5 cut-score required a person of the plaintiff’s age to be in “average” or “good” condition and was
therefore not unreasonable.

Although the 33.5 ml/kg/min value reflects the level of aerobic fitness required to meet the
physiological demands of firefighting, the cut-score often recommended is in the 40s (Davis &
Dotson, 1992; Sothmann et al., 1990). The rationale used is that aerobic fitness declines with age.
The cross-sectional rate of decline in VO2max is about 0.4 to 0.5 ml/kg/min/year (Buskirk &
Hodgson, 1987; Jackson et al., 1995; Jackson et al., 1996). Although the average decline is well
documented, there is evidence that people vary considerably in the rate that their aerobic capacity
declines with age. Both cross-sectional and longitudinal data (Jackson et al., 1995; Jackson et al.,
1996; Kasch, Boyer, VanCamp, Verity, & Wallace, 1990) suggest that about 50 percent of the rate
that aerobic capacity declines with age is due to differences in lifestyle. The rate of decline for those
who remain aerobically active and maintain their level of body composition is estimated to be about
0.25 ml/kg/min/year compared with the average of about 0.5 ml/kg/min/year.
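
A rough sketch of what these decline rates imply follows. It assumes a strictly linear decline, which is a simplification, and the starting value of 42.5 ml/kg/min is chosen only because it is one of the cut-scores discussed in this section; the function name is illustrative.

```python
def years_until_floor(start_vo2max, floor_vo2max, decline_per_year):
    """Years for VO2max to fall from start to floor, assuming a constant linear decline."""
    if decline_per_year <= 0:
        raise ValueError("decline_per_year must be positive")
    return (start_vo2max - floor_vo2max) / decline_per_year


START = 42.5   # ml/kg/min, assumed VO2max at hire
FLOOR = 33.5   # ml/kg/min, minimum reported for fire suppression tasks

average_decline = years_until_floor(START, FLOOR, 0.5)    # average rate
active_decline = years_until_floor(START, FLOOR, 0.25)    # active, stable body composition
print(f"average decline: {average_decline:.0f} years to reach {FLOOR}")
print(f"active lifestyle: {active_decline:.0f} years to reach {FLOOR}")
```

Under these assumptions, halving the rate of decline doubles the time before a firefighter hired at 42.5 ml/kg/min would fall to the 33.5 ml/kg/min level.
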

The logic of using a VO2max cut-score in the 40s with young applicants is that firefighters
would otherwise not have the physiological capacity to meet the demands of firefighting as they age
(Sothmann et al., 1990). The Lanning v. SEPTA and Smith v. City of Des Moines court rulings suggest that
setting cut-scores above the 33.5 ml/kg/min may be legally hazardous. Setting firefighter aerobic
capacity cut-scores higher than 33.5 ml/kg/min in an effort to account for potential aging effects
increases the disparate impact for both sex and age discrimination. Table 7.2 illustrates the potential
influence of aging and gender on cut-scores of 33.5 and 42.5 ml/kg/min. Using the data from
two large NASA/Johnson Space Center samples (Jackson et al., 1995; Jackson et al., 1996), the
average VO2max for men and women for selected ages was estimated. Using the standard errors
for regression models used to estimate the cross-sectional decline in VO2max of men and women,
and the normal curve, the proportion of men and women for selected ages who would pass
cut-scores of 33.5 and 42.5 ml/kg/min was estimated. The male and female pass rates for each age were
used to estimate the likelihood of disparate impact for gender. Table 7.2 shows that the pass rate
for both men and women is inversely related to age. The pass rate of both men and
women is also lower at the higher cut-score, showing that both age and cut-score affect pass
rates. More important, both age and cut-score affect the likelihood of adverse impact against
women. Table 7.2 documents that as age and cut-score increase, the likelihood of a disparate impact
for gender increases. On the basis of the Lanning v. SEPTA ruling and the trends shown in Table
7.2, it appears that setting aerobic capacity firefighter standards is an uncertain legal task unless a
minimum job-related standard can be validly defined.
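
The normal-curve step described above can be sketched as follows. The group means are the age-25 averages listed in Table 7.2, but the standard errors of the regression models are not reported in this text, so the values 6.5 and 7.0 ml/kg/min are placeholders assumed purely for illustration; the output therefore will not reproduce the table exactly.

```python
from statistics import NormalDist


def pass_rate(mean_vo2max, standard_error, cut_score):
    """Proportion of a normally distributed group expected to meet or exceed the cut-score."""
    return 1.0 - NormalDist(mu=mean_vo2max, sigma=standard_error).cdf(cut_score)


# Age-25 means from Table 7.2; the standard errors are ASSUMED placeholders.
FEMALE_MEAN, MALE_MEAN = 39.3, 48.0
FEMALE_SE, MALE_SE = 6.5, 7.0

for cut in (33.5, 42.5):
    female = pass_rate(FEMALE_MEAN, FEMALE_SE, cut)
    male = pass_rate(MALE_MEAN, MALE_SE, cut)
    # The 4/5s comparison: female pass rate relative to male pass rate.
    print(f"cut-score {cut}: women {female:.1%}, men {male:.1%}, ratio {female / male:.2f}")
```

Whatever standard errors are used, raising the cut-score moves it further into the upper tail of the lower-scoring group first, which is why the female-to-male ratio shrinks as the cut-score rises.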


Table 7.2 Average VO2max (ml/kg/min) for males and females of selected ages, percentages that would
pass the cut-score, and 4/5s rate for adverse impact

        Average VO2max      Cut-score 33.5 ml/kg/min        Cut-score 42.5 ml/kg/min
Age     Female    Male      Females %Pass   Males %Pass     Females %Pass   Males %Pass
25      39.3      48.0      81.6            97.9            30.8            77.9
35      33.9      43.4      52.4            91.8            9.0             54.0
45      28.5      38.8      21.8            77.0            1.5             30.1
55      23.1      34.2      5.3             54.0            <0.1            12.1
65      17.7      48.0      0.7             29.1            <0.1


Strength: Hogan (1991) reports that the physiological construct related to most industrial tasks
is strength. Not only is a sufficient level of strength necessary to perform many common industrial
tasks, but a lack of the strength demanded by a task is also likely to increase the risk of injury
(Dehlin, Hendenrud, & Horal, 1976; Herrin, Jaraiedi, & Anderson, 1986; Magora, 1970; Snook,
Campanelli, & Hart, 1978).

In the case EEOC v. Shell Western E and P (U.S. District Court, Central District of
California) a preemployment strength test was challenged for sex discrimination. The test was used
to screen applicants for entry-level jobs at oil and gas production facilities. The task analysis identified
the physically demanding tasks to be push-and-pull forces required when completing well-pulling
tasks. The amount of push-and-pull force required to perform the tasks was measured and
used to define the cut-score. The sum of grip, arm, and torso isometric strength was found to be
highly correlated (> 0.80) with work-sample push/pull tests. Regression models were used to define
the level of isometric strength needed to generate the level of push-and-pull force required by the task.
This isometric strength level was used to define the cut-score (Laughery & Jackson, 1982).
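
One way to picture the regression step is sketched below: fit a line predicting work-sample force from summed isometric strength, then solve for the strength that yields the force the task demands. The data points, units, and the required force of 110 are all hypothetical, and a simple least-squares line is assumed here; the text does not state which regression form the original study used.

```python
def fit_line(x, y):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def strength_for_force(required_force, slope, intercept):
    """Isometric strength at which the fitted line predicts the required force."""
    return (required_force - intercept) / slope


# Hypothetical validation sample: summed isometric strength vs. work-sample push/pull force.
isometric_strength = [200, 300, 400, 500, 600]
work_sample_force = [60, 80, 100, 120, 140]

slope, intercept = fit_line(isometric_strength, work_sample_force)
cut_score = strength_for_force(110, slope, intercept)  # 110 = assumed force demanded by the task
print(f"predicted force = {slope:.2f} * strength + {intercept:.1f}")
print(f"isometric strength cut-score: about {cut_score:.0f}")
```

The key design point is that the cut-score is anchored to a measured task demand rather than to a percentile of the applicant pool.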

The isometric strength tests were shown to have a disparate impact based on sex. This was expected
and consistent with data showing that the strength of the average woman is about 50 percent of
the strength of the average man (Baumgartner & Jackson, 1999; McArdle et al., 1991; NIOSH, 1977; Wilmore
& Costill, 1994). A major concern raised by the plaintiffs' expert witness was that the isometric
strength tests just measured upper body strength, and he argued that women performed work tasks
like those identified by the task analysis with their legs. The failure to also measure leg strength was
therefore a major argument. In addition, evidence introduced by the
plaintiffs suggested that the physical demands of the job had changed. This resulted in an agreement
between the EEOC and the defendant to redo the validation study and to include a leg strength test.

The task analysis of the new validation study documented that the physical demands of the job
had changed (Jackson, Osburn, Laughery, & Sekula, 1998). The physically demanding well-pulling
work was now being done by contract labor. The physically demanding tasks required by the current
workers were valve cracking and lifting valves that weighed 75 pounds. The validation study was
completed under the supervision of an exercise physiologist appointed by the EEOC. The study
documented that isometric arm, shoulder, torso, and leg strength were correlated with valve cracking
and lifting work-sample tests. The isometric strength tests were also found to be correlated with
supervisor ratings of the worker's physiological ability to do the work. Simple linear and logistic
regression models were used to define the minimum level of strength required to do these tasks.
These data were used to define the cut-score. Post hoc analysis showed that all female incumbents
exceeded the cut-score. Presently, the EEOC-supervised validation study has not been challenged.

Body Composition: A major issue in the case of the EEOC v. Mountain States Telephone and
Telegraph Co., d/b/a U.S. West Communications (U.S. District Court for the District of New
Mexico) was the use of skinfold fat to select workers for outdoor telephone craft jobs. An important
physically demanding task of these craft jobs was pole climbing. The issues leading to the
development of this study were the large differences between male and female workers in turnover
and accident rates. After six months, 43 percent of the women left the outdoor craft jobs compared
with only 8 percent of the men. An extensive job analysis showed that pole climbing was an essential,
physically demanding work task. Accident data showed that women sustained substantially
more injuries than men from falls while climbing or working on poles.

Using the results of a pilot study completed by exercise physiologists (Bernauer & Bonanno,
1975), industrial/organizational psychologists (Reilly, Zedeck, & Tenopyr, 1979) completed a
criterion-related validation study designed to develop a test for selecting applicants with the
physiological capacity to climb poles safely. The four criteria of job performance were:

1.  time  to  complete  the  pole-climbing  school, 

2.  completion  of  pole-climbing  school  (a  number  withdrew  from  the  course), 

3.  field  observations  of  pole-climbing  proficiency,  and 

4.  accidents  for  six  months  after  entering  outdoor  craft  work. 

A series of physical ability tests and the pole-climbing criteria were obtained on a sample
consisting of 78 female and 132 male pole-climbing school applicants. Multiple regression selected a
three-item battery consisting of body density estimated from skinfold fat, balance, and an isometric
arm strength test. The statistically significant correlations between the three-test battery and each of
the four criteria were time to complete the course, 0.46; training drop-out, 0.38; field observations of
pole-climbing proficiency, 0.53; and accidents, 0.15.

The three-item test was used to select and disqualify applicants for pole-climbing school.
Successful completion of pole-climbing school was a requirement for the outside craft position. In
the validation study (Reilly et al., 1979), body density of the male and female subjects was estimated
with gender-specific skinfold equations (Sloan, 1967; Sloan, Burt, & Blyth, 1962). A perceived
limitation of these equations was the use of different combinations of skinfolds. In an effort to
standardize test procedures, the final battery used just triceps skinfold because this site was common to
both male and female equations. It is well documented in the literature that women’s triceps
skinfold is significantly larger than men’s. To illustrate, the means (± SD) for triceps skinfold
of the women and men used to develop generalized skinfold equations (Jackson & Pollock,
1978; Jackson, Pollock, & Ward, 1980) were 18.2 ± 5.9 for women and 14.2 ± 6.1 for men. Using
a common cut-score for men and women resulted in a disparate impact against women.

The argument made for the female plaintiffs was that the use of triceps skinfolds was discriminatory
because of the gender-specific difference in triceps skinfold. Although it is physiologically
sound to assume that persons with high levels of body fatness will have more difficulty climbing
poles, and the statistical evidence of the criterion-related validation study supported this assumption,
the defendants in the case agreed to stop using the test. Although this case did not progress
to the point of a court ruling, it does illustrate the dangers of using body composition data for
defining employment cut-scores. This agreement resulted in the development of new tests for the
telephone industry that, at the time this chapter was written, have not been challenged in court.


A  View  to  the  Future 


With all the complexities that affect the U.S. judicial system, trying to determine what will happen
in the future is very risky. Some recent U.S. Supreme Court decisions do suggest that discrimination
as defined by ADEA and ADA is changing:

• The two recent Supreme Court cases of Sutton v. United Airlines, Inc., and Murphy v.
United Parcel Service, Inc., showed that the burden of documenting a disability clearly rests
with the employee and that an employee who fails to control a controllable disability may
lose protection under ADA (Rothstein et al., 1999). A question may be raised whether fitness,
especially weight or body composition, will be viewed as a “controllable disability.”

• A recent U.S. Supreme Court ruling struck down part of a Federal law that allowed state
employees to sue employers for age discrimination. The case was brought on by senior college
professors who were paid less than their younger colleagues. The Court ruled that older
workers have no real constitutional protection against discrimination by state employers.
Although the ruling does not seem to have a direct application to physical testing, the text
from the court ruling suggests that it may have relevance to public-safety positions. A quote
of the court ruling from the Houston Chronicle states, “As a result, states can discriminate
against older workers merely by showing that the discrimination is rationally related to a
legitimate state interest, the court said. For example, a state can have a mandatory retirement
age of 50 for police based on its desire to ensure that all of its officers are in top physical
condition, the court added” (Houston Chronicle, January 12, 2000).


Endnotes 

1.  Findings  of  Fact  and  Conclusions  of  Law  as  to  Liability,  EEOC  v.  Simpson  Timber  Company,  U.S. 
District  Court,  Western  District  of  Washington,  1992. 

Appendix A


Physical  Fitness  and 
Specific  Health  Outcomes 


Overweight  and  Obesity 


Obesity has been defined as “A chronic disease characterized by an excessively high amount of
body fat in relation to lean body mass” (Hoeger & Hoeger, 1998). The ACSM has offered the general
criterion of “...a percent body fat that increases disease risk” (1991), which might be further
clarified as “...weighing 20 percent more than recommended for a given body height” (Nieman,
1998). This raises the question of what defines the recommended weight in the first place. When
using the widely accepted body mass index (BMI), calculated by dividing body weight in kilograms
by height in meters squared (kg/m2), a measurement greater than or equal to 25 is classified as
overweight and a measurement greater than or equal to 30 as obese (NIH guidelines # 98-4803).
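
The BMI calculation and the two thresholds just described can be written out as a short sketch. The function names and category labels are illustrative, and treating a value of exactly 30 as obese is an assumption made here for the boundary case.

```python
def body_mass_index(weight_kg, height_m):
    """BMI = body weight in kilograms divided by height in meters squared (kg/m2)."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


def weight_status(bmi):
    """Classify a BMI value using the 25 (overweight) and 30 (obese) thresholds."""
    if bmi >= 30:
        return "obese"
    if bmi >= 25:
        return "overweight"
    return "below overweight threshold"


for weight in (80, 90, 100):  # kilograms, for a person 1.80 m tall
    bmi = body_mass_index(weight, 1.80)
    print(f"{weight} kg at 1.80 m: BMI {bmi:.1f}, {weight_status(bmi)}")
```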

Unfortunately, the current lifestyle and environment of most Americans exacerbate the accumulation
of excessive body fat. With multiple modern conveniences and transportation that
requires little or no physical exertion, there are few daily chores that necessitate even moderate
activity. When inactivity is coupled with the high-fat, large-portioned convenience foods served at
restaurants and kiosks, there is no mystery to the increasing body weight and BMI of our population.
According to the U.S. Department of Health and Human Services, an estimated 107 million U.S.
adults are overweight or obese and the numbers are increasing (2000). With more than half of the
adults in the U.S. overweight or obese, it is important to consider the impact on overall health and
morbidity and its implications for our society as well as for the quality of life of our citizens.
Obesity has been linked to type II diabetes (West, 1978; Must et al., 1999; Hu et al., 1999),
cardiovascular disease (Lee et al., 1999; Hubert et al., 1983), hypertension (Bray, 1985), certain types
of cancer, and overall mortality (Lee et al., 1999). Obesity is considered a major contributor to many
causes of disease and death, accounting for 15 percent to 20 percent of the annual mortality rate
(Hoeger & Hoeger, 1998).

Using BMI to calculate obesity level, Must et al. (1999) described the relationship between the
level of obesity and the prevalence of type II diabetes, gallbladder disease, coronary heart disease,
high blood cholesterol, high blood pressure, and osteoarthritis. The study was a cross-sectional survey
that used data from the Third National Health and Nutrition Examination Survey (NHANES
III) that was conducted in two phases from 1988 to 1994. A total of 16,884 adults (25 years or older)
were in the sample, all of whom were classified by weight status (overweight or obese: BMI greater
than or equal to 25 kg/m2). From these survey demographics it was determined that 63 percent of men
and 55 percent of women had a BMI of 25 or greater, thus classifying them as overweight or obese.
The results showed that “the prevalence of two or more health conditions increased with weight status
category (overweight or obese) across all racial and ethnic subgroups” (Must et al., 1999). According to
this study, the more overweight an individual was, the greater the chance of developing multiple health problems.

Together with its associated health problems, the 1995 price tag of obesity amounted to approximately
$99 billion in medical expenses and lost productivity (U.S. Department of Health and
Human Services, 2000). That number is undoubtedly climbing. Such an extensive list of grave
complications that arise with the incidence of obesity, coupled with its medical costs and lost
productivity, demands serious attention to identifying and implementing effective remedial measures.

Many  studies  have  documented  the  salutary  effects  of  regular  and  moderate  physical  exercise  on 
obesity  (DiPietro,  1995;  Wilmore,  1996;  Stefanick,  1993).  According  to  Physical  Activity  and 
Health:  A  Report  of  the  Surgeon  General,  “Several  cross-sectional  studies  report  lower  weight,  BMI, 
or  skinfold  measures  among  people  with  higher  levels  of  self-reported  physical  activity  or  fitness” 
(U.S. Department of Health and Human Services, 1996). The Surgeon General's report also
summarizes the results of several comprehensive review articles and meta-analyses that examined the
impact exercise training had on body weight and obesity.

These reviews conclude that:

1.  physical  activity  generally  affects  body  composition  and  weight  favorably  by  promoting  fat 
loss  while  preserving  or  increasing  lean  body  mass; 

2. the rate of weight loss is positively related, in a dose-response manner, to the frequency
and duration of the physical activity session, as well as to the duration (e.g., months, years)
of the physical activity program; and

3. although the rate of weight loss resulting from physical activity without caloric restriction
is relatively slow, the combination of increased physical activity and dieting appears to be
more effective for long-term weight regulation than is dieting alone (p. 134).

Researchers continue to delve into the causes and mitigation strategies for the prevalence and
severity of obesity in our society. Genetic influences, high-calorie diets, and insufficient energy
expenditure are the three central factors being explored. Genetic factors have been examined
through studies on twins. Even when reared apart from one another, identical twins are much closer
in body weight at middle age than are fraternal twins or siblings (Nieman, 1998). However, a
study of 6,000 twins in Finland found that lifestyle factors were more important than genetics in
determining weight gain over a six-year period (Korkeila, Kaprio, Rissanen, & Koskenvuo, 1995).
The combination of these results indicates that the influence of lifestyle can be stronger than the
influence of genetics in addressing body weight. It is widely known that a diet consisting of an
excessive number of calories and fat, particularly when combined with inactivity, contributes to
weight gain (Leibel, Rosenbaum, & Hirsch, 1995).

Since  it  is  known  that  the  most  effective  intervention  for  obesity  is  the  simultaneous  increase 
of  physical  activity  and  reduction  of  caloric  intake  to  a  certain  degree  (Anderson  et  al.,  1999),  it  may 
seem  like  there  is  a  simple  solution.  However,  there  is  more  complexity  to  the  formula  than  simply 
calories  in  and  calories  out.  The  complexity  resides  not  only  in  determining  what  causes  the  body 
to  expend  the  most  calories  but  also  in  identifying  the  most  effective  approaches  to  encourage  us 
to  comply  with  these  simple  but  effective  interventions. 

The basal metabolic rate (BMR) is the number of calories expended for basic bodily functions
such as digestion, absorption, transportation of bodily fluids, as well as muscle and skin repair. It
varies from individual to individual due to known factors such as genetics and body composition,
and researchers continually seek other unknown determinants. So far, increasing muscle mass has
been shown to increase the BMR and starvation can reduce it (Leibel et al., 1995). For the purposes
of this chapter, the effect of physical activity on body-fat loss will be assessed for its effectiveness
in reversing the problem of obesity.

An increase in muscle mass causes an increase in energy expenditure by an estimated 35 calories
per pound of muscle per day (Campbell et al., 1994). This implies that more muscle results in
greater caloric expenditure, which translates into less caloric storage in the form of body fat. Many
studies have shown that regular physical activity maintains or increases lean body mass, expends
energy, and helps to control weight (DiPietro, 1995; DiPietro, Kohl, Barlow, & Blair, 1999;
Stefanick, 1993). In addition, an increase in muscle actually provides more fat-burning tissue and
thus increases the baseline rate of caloric expenditure. The standard recommendation by government
health experts and the ACSM is to engage in exercise at a moderate intensity for 30 to 60
minutes at least three days out of the week (Pollock et al., 1998).
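
The estimate of 35 calories per pound of muscle per day lends itself to a quick worked example. The sketch below simply scales that figure; the 3-pound gain is a hypothetical value chosen for illustration, and no claim is made about how much of the added expenditure translates into fat loss.

```python
KCAL_PER_LB_MUSCLE_PER_DAY = 35  # estimate cited in the text (Campbell et al., 1994)


def extra_daily_expenditure(muscle_gain_lb):
    """Additional calories expended per day for a given gain in muscle mass."""
    return KCAL_PER_LB_MUSCLE_PER_DAY * muscle_gain_lb


gain_lb = 3  # hypothetical muscle gain
per_day = extra_daily_expenditure(gain_lb)
per_year = per_day * 365
print(f"{gain_lb} lb of added muscle: about {per_day} calories per day, {per_year} per year")
```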

Muscle or lean body mass is normally increased by strength training, which has become increasingly
popular among health enthusiasts as scientists discover its many benefits including weight
loss. Studies have shown that doing strength-training exercises, such as lifting weights, has a positive
effect on bone density, preserving and increasing muscle mass, and reducing body fat
(Campbell, Crim, Young, & Evans, 1994). The current recommendation by the Surgeon General
is to perform resistance-training activities at least twice per week. “At least 8 to 10 strength-developing
exercises that use the major muscle groups of the legs, trunk, arms, and shoulders should be
performed at each session, with one or two sets of 8 to 12 repetitions of each exercise” (U.S.
Department of Health and Human Services, 1996). Similar guidance has been provided in the
ACSM Fitness Prescription Position Stand (Pollock et al., 1998). It is clear that physical activity is
central to obtaining positive results.

Knowing the results of the studies and their implications for better health and quality of life is
merely a necessary condition, not a sufficient one, for dealing with overweight and obesity.
The chances that an obese person is willing or able to engage in the type, duration, and intensity
of activity that is recommended are not good. A recent study surveyed almost 108,000 U.S.
adults to examine the prevalence of attempts to lose weight and the strategies that were implemented
(Serdula et al., 1999). The results showed weight loss as a common concern, with 28.8 percent
of men and 43.6 percent of women trying to lose weight. However, only about one-fifth of the
men and women reported actually implementing the recommendation to simultaneously decrease
caloric intake while engaging in regular physical activity (150 minutes per week). Even with such
a large percentage of the population trying to decrease their weight, the mean body weight of U.S.
adults has increased by 7.6 pounds during the past 15 years (Fine et al., 1999). The indications from
these  results  are  that  people  either  do  not  know  or  do  not  adhere  to  the  proven  guidelines  for 
weight  loss.  Individual  compliance  is  the  major  factor  here,  as  it  is  with  all  medical  interventions, 
and  it  must  be  respected  as  a  contributor  to  overall  fitness. 

The inability of Americans to maintain a healthy average weight results in diminished health-related
quality of life and vitality, according to the investigation by Fine et al. (1999). Importantly,
self-efficacy, image, and general well-being are associated with increased physical fitness. After surveying
more than 40,000 adult women, investigators found that weight gain was associated with
decreased  physical  function  and  vitality  and  increased  bodily  pain  regardless  of  baseline  weight. 
Weight  loss  on  the  other  hand  was  associated  with  improved  physical  function.  Even  with  the 
resulting  improvements  in  health,  most  people  in  the  United  States  are  not  taking  action  to  eradi¬ 
cate  the  problem  as  evidenced  by  our  increasingly  overweight  and  obese  population.  It  seems  that 
a  Catch-22  is  in  operation  here.  Because  of  the  decrease  in  physical  function  caused  by  weight  gain, 
it  is  difficult  for  many  overweight  people  to  be  physically  active  enough  to  begin  the  weight  reduc¬ 
tion  process.  A  high  level  of  body  fat  makes  activity  more  taxing,  uncomfortable,  and  frustrating, 
which  promotes  their  current  sedentary  lifestyle.  It  is  an  accelerating,  downward  spiral  that  results 
in  further  weight  gain  and  increased  susceptibility  to  illness  and  injury.  The  most  important  chal¬ 
lenge  to  our  nation's  leading  health  experts  is  to  first  of  all  educate  our  people  to  the  benefits  of 
weight  loss  resulting  from  exercise  in  terms  of  longevity  and  quality  of  life. 

Thus,  it  appears  that  promoting  awareness  of  the  efficacy  of  lifestyle  activity  as  an  alternative 
to  structured  aerobic  exercise  could  be  effective  in  controlling  the  rate  of  overweight  and  obesity 
mainly  by  making  compliance  with  the  intervention  more  palatable  to  the  majority  of  the  popula¬ 
tion  at  risk.  Implementing  the  results  of  these  studies  that  suggest  lifestyle  changes  and  small  incre¬ 
ments  of  activity  throughout  the  day  may  encourage  otherwise  inactive  people  to  begin  to  engage 
in  a  more  physically  oriented  lifestyle. The  problem  of  an  overly  fat  nation  is  obviously  significant, 
as  it  exacerbates  multiple  health  conditions  already  present.  If  obesity  is  proactively  addressed  and 
eradicated,  the  onslaught  of  many  health  problems  can  be  avoided.  These  are  powerful  findings  in 
favor  of  adopting  a  set  of  activities  that  may  easily  become  part  of  our  daily  activity  repertoire. 


Hypertension


Blood  pressure  is  “...the  product  of  cardiac  output  and  peripheral  vascular  resistance” 
(American College of Sports Medicine, 1994). This refers to the pressure exerted as the heart
pumps blood through the arteries, and is measured in millimeters of mercury (mm Hg). It is expressed
in  two  numbers,  systolic  blood  pressure  (higher  number)  and  diastolic  blood  pressure  (lower  num¬ 
ber).  The  systolic  pressure  is  exerted  by  the  blood  being  forced  against  the  walls  of  the  arteries  dur¬ 
ing  the  contraction  of  the  heart.  Diastolic  pressure  occurs  during  the  relaxation  phase  of  the  heart, 
when  the  blood  is  again  pushed  against  the  artery  walls.  Hypertension  is  defined  as  a  blood  pres¬ 
sure  reading  of  140/90  or  greater,  with  160/100  being  classified  as  severe. 

Based on estimates made in 1996 by the American Heart Association (AHA), nearly 50 million
Americans are hypertensive. Each year, high blood pressure kills over 37,000 Americans and
it contributes to over 700,000 deaths (NIH, 1998). The National Heart, Lung, and Blood Institute
reports  that  when  left  untreated,  high  blood  pressure  can — 

• Cause the heart to get larger, which may lead to heart failure

• Cause small blisters (aneurysms) to form in the brain’s blood vessels, which may cause a stroke

•  Cause  blood  vessels  in  the  kidney  to  narrow,  which  may  cause  kidney  failure 

•  Cause  arteries  throughout  the  body  to  harden  faster,  especially  those  in  the  heart,  brain,  and 
kidneys,  which  can  cause  a  heart  attack,  stroke,  or  kidney  failure. 
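
The blood pressure thresholds given at the start of this section (140/90 or greater for hypertension, with 160/100 classified as severe) can be sketched as a simple classification. The function name, the category labels, and the choice to let either the systolic or the diastolic value trigger a category are assumptions made for illustration; this sketch is not a diagnostic tool.

```python
def classify_blood_pressure(systolic: int, diastolic: int) -> str:
    """Classify a reading (mm Hg) using the thresholds quoted in the text.

    Assumption: a reading falls into a category when EITHER number meets
    that category's threshold. The text does not spell this rule out.
    """
    if systolic <= 0 or diastolic <= 0:
        raise ValueError("pressures must be positive")
    if systolic >= 160 or diastolic >= 100:
        return "severe hypertension"
    if systolic >= 140 or diastolic >= 90:
        return "hypertension"
    return "below hypertension threshold"


if __name__ == "__main__":
    # Hypothetical readings for demonstration only.
    for reading in [(120, 80), (142, 88), (165, 95)]:
        print(reading, classify_blood_pressure(*reading))
```

Checking the severe threshold first keeps the two categories from overlapping, since any reading at or above 160/100 also satisfies the 140/90 rule.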

Studies  have  found  that  high  blood  pressure  also  affects  the  brain.  When  people  have  high 
blood pressure during middle age, they are more likely to experience cognitive problems 25 years
later. This means that one’s ability for memory, problem-solving, concentration, and judgment during
old age is impaired (Launer et al., 1995). According to the ACSM, “...individuals with chronically
elevated blood pressure have an increased probability of stroke, coronary artery disease, and
left ventricular hypertrophy” (1996). Fortunately, high blood pressure can be controlled. Unlike
obesity, hypertension can be treated with effective medications if they are taken daily. Unfortunately, these medications
are expensive, can have adverse side effects, and require daily administration. To help prevent
and  control  blood  pressure,  the  National  Institutes  of  Health  (NIH,  1998)  recommends  that  all 
people  change  their  lifestyle  behaviors  in  the  following  ways — 

•  Lose  weight  if  overweight 

•  Reduce  sodium  intake  to  less  than  2,300  mg  per  day 

• Maintain adequate dietary potassium intake (fruits and vegetables)

•  Limit  alcohol  intake 

•  Exercise  regularly 

NIH  urges  those  who  are  hypertensive  to  implement  the  above  recommendations  for  three  to  six 
months  before  starting  drug  therapy.  Such  recommendations  are  indicative  of  the  powerful  influence 
that lifestyle changes, including regular physical activity, can have on disease prevention. So far it has
become  clear  that  hypertension  is  a  serious  though  preventable  and  treatable  health  issue.  As  with 
obesity,  exercising  at  a  moderate  intensity  for  at  least  30  minutes  each  day  on  most  days  of  the  week 
seems  to  have  a  positive  effect  on  reducing  hypertension. 

The  degree  to  which  exercise  can  help  in  preventing  and  reducing  hypertension  has  been  shown 
in several studies (Blair et al., 1984; Folsom, Kushi, & Hong, 1996; Kokkinos et al., 1998). Rueckert,
Slane, Lillis, & Hanson (1996) have shown that there is a 20 to 50 percent greater risk for developing
hypertension  in  inactive  people  when  compared  with  those  who  are  active.  She  collected  data  on  18 
patients  with  high  blood  pressure  before,  during,  and  after  they  exercised.  She  found  that  walking  on 
the  treadmill  for  45  minutes  decreased  their  blood  pressure  below  resting  levels  for  up  to  two  hours 
after they finished walking. In other research, Kokkinos et al. (1998) spent sixteen weeks examining
46 African-American men who were severely hypertensive and the effects of either anti-hypertensive
medication alone or the medication combined with moderate exercise (1995). The researchers found
a significant decrease in the diastolic pressure of the exercising group, from 88 to 83 mm Hg, and an
increase in the diastolic pressure in the medication-only group, from 88 to 90 mm Hg (P=0.002).
They continued to monitor the subjects for 36 weeks and found substantial reductions in the diastolic
blood  pressure  of  the  exercising  group  even  after  reducing  their  medications. 

Because  the  research  demonstrates  a  return  to  the  pre-exercise  blood  pressure  level  soon  after  a 
person  stops  exercising,  the  ACSM  recommends  that  hypertensives  engage  in  frequent  physical 
activity  and  incorporate  this  exercise  as  a  permanent  adjustment  to  their  lifestyles.  This  recommen¬ 
dation  is  coupled  with  one  that  emphasizes  aerobic  activity  rather  than  weight  training.  If  weight 
training  is  to  be  used  with  aerobic  exercise,  the  ACSM  suggests  a  modification  to  the  weight-lifting 
guidelines  to  include  10  to  15  repetitions  (rather  than  8  to  12)  during  each  weight  training  exercise. 
Over  time  with  continued  activity,  the  blood  vessels  relax,  creating  a  long-term  lowering  of  the  blood 
pressure.  By  adhering  to  the  guidelines  for  regular,  moderate  aerobic  activity,  the  research  thus  far 
seems  to  show  that  hypertensives  can  positively  change  their  blood  pressure  and  risk  of  mortality. 


Coronary Heart Disease and Cardiovascular Disease

Cardiovascular  disease  (CVD)  is  the  number  one  killer  in  the  United  States  today.  There  were 
an estimated 954,407 deaths in 1996 resulting from coronary heart disease (CHD) and stroke combined
(NIH, 1996). This total is 41.2 percent of all deaths in that year, and its magnitude calls for
continuing and diligent investigation of approaches leading to its prevention and control.
The Surgeon General reports that “...reviews of epidemiological literature have concluded that
physical  activity  is  strongly  and  inversely  related  to  CVD  risk”  (1996).  In  other  words,  the  more 
physical  activity  one  engages  in,  the  lower  one's  risk  of  CVD.  In  addition,  the  correlation  between 
inactivity  and  CVD  has  been  repeatedly  examined,  with  findings  of  a  direct  relationship  so  con¬ 
vincing  that  inactivity  is  now  listed  as  a  risk  factor  for  developing  CVD  (NIH,  1996). 

There have been many studies examining the dose-response relationship
between exercise and risk of CVD (Kannel & Sorlie, 1979; Paffenbarger et al., 1984; Kannel,
Belanger, Dagostino, & Israel, 1986). One such study at the Cooper Institute of Aerobics Research
in  Dallas,  Texas,  investigated  the  relationship  between  cardiovascular  fitness  levels  and  CVD. 
Included  in  this  study  were  25,341  male  Cooper  Clinic  patients  who  underwent  a  maximal  graded 
stress test and then were tracked for long-term follow-up. There were 226 cardiovascular deaths during
the follow-up years. After accounting for other CVD predictors (high blood pressure, smoking,
and high blood cholesterol), the researchers found a significant, inverse correlation between fitness
levels and CVD in subjects with no other predictors (P=0.001). The authors estimate that 20 percent
of the 226 CVD deaths were attributable to low fitness level (Farrell et al., 1998).

The  evidence  repeatedly  suggests  that  regular  physical  activity  protects  against  the  development 
of  CVD  (Bouchard  et  al.,  1994;  Haskel  et  al.,  1992). This  inoculation  effect  is  due  to  the  primary 
effects  of  exercise  on  improving  cardiovascular  health  and  to  the  favorable  effects  of  physical  activi¬ 
ty  on  other  CVD  risk  factors,  such  as  high  blood  pressure,  blood  lipid  levels,  insulin  resistance,  and 
obesity (U.S. Department of Health and Human Services, 1996). Risk factors are defined as “...personal
habits or characteristics that medical research has shown to be associated with an increased risk
of heart disease” (Nieman, 1998). In 1992 the American Heart Association added physical inactivity
to the list of “major risk factors that can be changed.” The list also includes cigarette smoking,
high blood pressure, and high blood cholesterol. Each of these three
risk factors affects 20 percent to 25 percent of the population. However, inactivity
is a risk factor for 60 percent of Americans (American Heart Association, 1996).

By  controlling  the  risk  factors  for  heart  disease  it  is  thought  that  up  to  90  percent  of  the  occur¬ 
rences  of  this  disease  could  be  prevented.  Combining  that  conclusion  with  the  results  of  numerous 
research  studies  on  the  positive  health  effects  of  exercise,  one  can  deduce  that  regular  physical  activ¬ 
ity  could  have  prevented  approximately  859,000  deaths  in  1996. The  question  then  is  which  activ¬ 
ities  are  best  suited  for  increasing  cardiovascular  health  and  preventing  the  risk  factors  for  and  the 
incidence  of  CVD.  Although  identifying  the  optimal  types  of  exercises  that  are  effective  for  differ¬ 
ent  classes  of  individuals  is  very  important,  the  most  critical  question  is  why  most  Americans  are 
not  exercising  in  light  of  its  proven  benefits.  The  answer  lies  in  the  psychology  of  compliance. 
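
The figure of approximately 859,000 preventable deaths follows from applying the 90 percent upper bound to the 954,407 CHD and stroke deaths reported earlier in this section. The short Python sketch below reproduces that arithmetic; the constant and function names are assumptions introduced for this example, and the implied total number of deaths is derived only from the 41.2 percent share quoted in the text.

```python
# Figures quoted in the text for 1996.
CVD_DEATHS_1996 = 954_407        # CHD and stroke deaths combined (NIH, 1996)
CVD_SHARE_OF_ALL_DEATHS = 0.412  # 41.2 percent of all deaths that year
PREVENTABLE_FRACTION = 0.90      # "up to 90 percent", an upper bound


def preventable_deaths(deaths: int, fraction: float) -> int:
    """Deaths that would be avoided if `fraction` of cases were prevented."""
    return round(deaths * fraction)


def implied_total_deaths(deaths: int, share: float) -> int:
    """Total deaths implied when `deaths` make up `share` of the total."""
    return round(deaths / share)


if __name__ == "__main__":
    print(preventable_deaths(CVD_DEATHS_1996, PREVENTABLE_FRACTION))
    print(implied_total_deaths(CVD_DEATHS_1996, CVD_SHARE_OF_ALL_DEATHS))
```

Because 90 percent is stated as an upper bound, the result is best read as a ceiling on the estimate rather than as a prediction.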

According to the collective findings reported by the Surgeon General, “Activity that reduces
CVD  risk  factors  and  confers  many  other  health  benefits  does  not  require  a  structured  or  vigorous 
exercise  program.  The  majority  of  benefits  of  physical  activity  can  be  gained  by  performing  mod¬ 
erate-intensity  activities”  (1996).  These  findings  indicate  that  moderate-intensity  activities  confer 
significant health benefits, but such activity must be performed frequently. Fletcher et al. (2000)
reported that frequently engaging in activities such as biking, jogging, swimming,
brisk walking, hiking, climbing stairs, aerobic exercise, tennis, soccer, and basketball, to name
a few, has an especially beneficial training effect. When these activities make the heart rate exceed 40 percent to 50
percent of its maximal capacity, they are most effective. The activities that are considered low to
moderate in intensity, ranging from 40 percent to 60 percent of maximum capacity, include housework,
gardening, dancing, and leisure walking. When these activities are performed daily, their health
benefits are long term and predict a lower risk of cardiovascular disease (1996).

In  a  more  specific  examination  of  how  much  one  needs  to  walk  to  reap  its  protective  benefits, 
Sesso, Paffenbarger, Ha, and Lee investigated 1,564 women (mean age 45.5 years), initially free of
CVD. The data were collected from 1962 until 1993. The authors looked at the calories expended
in various activities such as number of stairs climbed, blocks walked, and sports played. They then
divided those data into approximate thirds (<500, 500-999, and 1,000 or more kcal/week) to develop a
quantitative dependent measure of fitness. During those years, 181 cases of CVD were identified.
The researchers adjusted for other coronary risk factors and body mass index (BMI), and then compared
the three “kcal expended” groups in terms of CVD risk. The results showed a 33 percent decrease in
CVD risk for those women who walked at least 10 blocks per day (approximately 6 miles per week).
In addition, there was an inverse association between lower BMI (<23 kg/m²) and CVD (1999).
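
The three weekly energy expenditure groups used in the study described above can be sketched as a small grouping function. The function names, the label strings, and the sample values are hypothetical and serve only to illustrate the cut points quoted in the text (under 500, 500 to 999, and 1,000 or more kcal per week).

```python
from collections import Counter


def activity_category(kcal_per_week: float) -> str:
    """Assign a weekly energy expenditure to one of the three groups."""
    if kcal_per_week < 0:
        raise ValueError("energy expenditure must be non-negative")
    if kcal_per_week < 500:
        return "<500"
    if kcal_per_week < 1000:
        return "500-999"
    return ">=1000"


def count_by_category(weekly_kcal_values) -> Counter:
    """Count how many participants fall into each group."""
    return Counter(activity_category(v) for v in weekly_kcal_values)


if __name__ == "__main__":
    # Hypothetical participants, not data from the study.
    sample = [250, 480, 500, 760, 999, 1000, 1800]
    print(count_by_category(sample))
```

Grouping a continuous measure in this way makes it possible to compare disease rates across a small number of activity levels, at the cost of ignoring differences within each group.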

When  considering  the  broad  and  highly  beneficial  health  effects  of  partaking  in  an  active 
lifestyle, it is astounding to note that only 15 percent of adults performed the recommended amount
of physical activity in 1997, and 40 percent of adults did not engage in any leisure-time physical
activity (U.S. Department of Health and Human Services, 2000). Knowing the difficulty that many
Americans have in meeting the 30-minute standard, three or more times per week, for general health
and fitness, experts have offered an alternative approach (Pate et al., 1995). Their hypothesis is that
more people will find ways to be active once they know that short periods of exercise (10 minutes)
a few times each day can protect against heart disease just as well as longer periods once a day. This
approach  may  be  more  realistic  to  implement  and  has  been  proven  to  be  effective. 


Diabetes 


Diabetes  is  characterized  by  the  inability  of  the  human  body  to  regulate  its  balance  of  glucose 
and  insulin.  The  condition  requires  a  consistent  and  stringent  lifestyle  that  dictates  specific  eating 
times, type of diet, physical activity, blood glucose monitoring, and insulin injections as individually
necessary. The people who are at the highest risk of developing diabetes are the ones with a high
body mass index, especially if they are inactive. A study (Helmrich et al., 1991) examined nearly
6,000 men for 14 years, measuring their leisure-time physical activity (expressed as calories expended
per week). The men who were both obese and inactive were four times more likely to develop
non-insulin  dependent  diabetes  (NIDDM)  than  the  lean  and  active  men  were.  In  addition,  the 
authors  found  that  for  each  500-calorie-per-week  increase  in  activity  expenditure,  there  was  a  6 
percent  reduction  in  the  risk  of  NIDDM. 
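
To make the reported dose-response figure concrete, the sketch below applies a 6 percent reduction in risk for each 500-calorie-per-week increase in activity. It assumes that the reductions compound multiplicatively, which the text does not state; the function name and the example values are likewise hypothetical.

```python
REDUCTION_PER_STEP = 0.06   # 6 percent reduction quoted in the text
KCAL_PER_STEP = 500         # per 500-calorie-per-week increase


def relative_risk(extra_kcal_per_week: float) -> float:
    """Risk of NIDDM relative to baseline after an increase in activity.

    Assumption: each 500 kcal/week step multiplies the risk by 0.94.
    """
    if extra_kcal_per_week < 0:
        raise ValueError("increase in activity must be non-negative")
    steps = extra_kcal_per_week / KCAL_PER_STEP
    return (1 - REDUCTION_PER_STEP) ** steps


if __name__ == "__main__":
    # Hypothetical increases in weekly activity expenditure.
    for extra in (0, 500, 1000, 2000):
        print(extra, round(relative_risk(extra), 4))
```

Under this assumption, an extra 1,000 calories per week would correspond to a relative risk of about 0.88, that is, a reduction of roughly 12 percent.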

The  relationship  between  a  sedentary  lifestyle  and  the  incidence  of  diabetes  has  been  observed 
in  other  countries  as  they  adopt  Westernized  or  technologically  advanced  lifestyles. Those  coun¬ 
tries  experienced  major  increases  in  the  prevalence  of  NIDDM  (West,  1978).  Such  findings  are 
further  supported  by  studies  comparing  individuals  who  moved  from  their  native  countries  to  more 
technologically  advanced  societies  with  their  ethnic  counterparts  who  remained  in  their  homeland. 
The  incidence  of  diabetes  was  much  greater  in  those  who  moved  (Ravussin,  Valencia,  Esparza, 
Bennett,  &  Schulz,  1994). 

Another  major  six-year  study  of  almost  7,000  Japanese-American  men  in  Hawaii  found  that 
the  rate  of  exhibiting  the  symptoms  of  NIDDM  was  lowest  in  the  most  active  men,  even  after 
adjusting  for  obesity,  age,  family  history,  and  other  factors  that  contribute  to  NIDDM  (Burchfiel 
et  al.,  1994).  Those  who  were  the  least  active  had  a  53.9  percent  incidence  rate,  while  the  most 
active  men  had  a  21.7  percent  rate.  This  finding  leads  to  the  necessity  to  identify  and  quantify  a 
threshold  of  activity  that  may  protect  against  diabetes. 

A group of scientists at Harvard conducted research to determine the benefits of moderate-intensity
activity, such as walking, as opposed to vigorous activities, with regard to the mitigation
of diabetes (Hu et al., 1999). They examined this relationship through a prospective cohort study that included
detailed data from more than 70,000 women in 11 U.S. states who were free of diabetes, cardiovascular
disease, and cancer. The researchers got updates in 1986, 1988, and 1992. During the eight years
of follow-up, 1,419 cases of type II diabetes were reported. The results from this statistical analysis
(adjusting for covariates) showed that a faster-than-usual walking pace was independently associated
with decreased risk. The researchers also discovered that equivalent energy expenditure, whether
through  walking  or  more  vigorous  activity,  resulted  in  comparable  magnitudes  of  risk  reduction. 

More  research  needs  to  be  done  for  the  evidence  presented  thus  far  to  be  substantiated. 
However,  the  link  between  obesity  and  diabetes  is  quite  clear  and  there  is  an  evident  link  between 
physical activity and obesity. With all of the research presenting the positive impact of moderate-intensity
exercise on these conditions, it is apparent that the current recommendation for regular
activity may decrease the risk for diabetes as well.


Osteoporosis 


Osteoporosis  is  characterized  by  a  loss  of  bone  mass,  deterioration  of  bone  tissue,  increasing 
bone fragility, and increased likelihood of fractures. According to the Surgeon General, osteoporosis
affects mostly older persons and is more common among women than among men (1996). This
is because women have a lower peak bone mass than men do, they lose bone mass at an
accelerated rate when estrogen levels decline (usually after menopause), and they have a longer life
span than men. In both men and women, however, the three general reasons for developing osteoporosis
are as follows: a deficient level of peak bone mass at physical maturity, failure to maintain
this bone mass during the third and fourth decades of life, and the decline in bone mass that occurs
during the fourth or fifth decade of life. The Surgeon General also reports that, “Physical activity
may  positively  affect  all  three  of  these  factors”  (1996). 

For  bones  to  maintain  their  structure,  they  must  have  force  applied  to  them.  According  to  Dr. 
David Nieman, author of The Exercise Health Connection, “Healthy individuals who undergo complete
bed rest for 4 to 36 weeks can lose an average of 1 percent bone mineral content per week, while
astronauts in a gravity-free environment can lose bone at a monthly rate as high as one to four percent
depending on the type of bone” (1996). Kirchmer, Lewis, and O’Connor (1996) conducted an
extensive inquiry into the effects of exercise on bone mass and concluded that young adults who are
athletic have a higher bone density than those who are sedentary. However, exercise is not the only
factor  contributing  to  bone  strength.  Hormones,  diet,  medications,  disease,  family  history,  race,  gen¬ 
der,  and  age  are  all  related  to  bone  density.  Similar  to  the  salutary  effects  of  exercise  for  other  dis¬ 
eases,  exercise  may  help  to  prevent  or  offset  bone  mass  reduction  no  matter  at  what  age  one  begins. 

After  conducting  research  with  young  adults,  experts  determined  that  physical  activity  plays  a 
significant  role  in  developing  and  maintaining  bone  mass.  There  also  seems  to  be  a  compelling  rela¬ 
tionship  between  an  increase  in  muscular  strength  and  an  increase  in  bone  density.  When  young 
female athletes were tested for bone density, researchers found the greatest density for those who
engaged in jumping and short bursts of powerful movement, such as one exhibits when playing basketball
or volleyball. Interestingly, swimmers, who exercised in a weightless environment, had a very
similar  bone  density  compared  with  those  who  were  sedentary  (Nieman,  1996).  Some  researchers 
have  observed  a  link  between  a  history  of  lifelong  physical  activity  and  greater  bone  mineral  mass 
as  one's  age  advances  (Snow,  1996).  This  positive  effect  of  physical  activity  results  in  fewer  inci¬ 
dences  of  hip  fracture  in  older  individuals. 

Bone mass and strength naturally decline with age (Cummings, 1985). Researchers also discovered
that by the age of 90, one-third of all women and one-sixth of all men have sustained a hip
fracture.  The  impact  of  hip  fractures  is  extensive,  accounting  for  more  deaths,  permanent  disabili¬ 
ty,  and  medical  institutional  care  costs  than  all  other  osteoporotic  fractures  combined  (U.S. 
Department  of  Health  and  Human  Services,  1996).  The  risk  of  falling  combined  with  the  impact 
of  the  fall  and  the  strength  of  the  bone  are  all  factors  determining  hip  fracture  risk.  Researchers 
suggest that exercise may have a twofold effect on such a risk: decreasing the incidence and severity
of falls (more muscle tone and perhaps better balance) and increasing the quantity and quality
of mineral in the bones (Smith & Tommerup, 1995). Regardless of gender, age, or status, exercise
reduces  bone  loss  and  increases  bone  mass,  much  like  the  effect  of  exercise  on  muscle. 

The studies conducted on postmenopausal women conclude that bone mineral density is correlated
with muscle strength (Sinaki, McPhee, Hodsdon, Merrit, & Offord, 1998). Unfortunately, the
positive response of bone tissue to exercise is reversible, which indicates the need for continual activity
throughout a person’s adult life. Such activity should include a moderate amount of weight-bearing
aerobic exercise and resistance training. A study conducted at Tufts University examined 39
postmenopausal  women  who  engaged  in  intensive  weight  training  for  45  minutes  two  times  a  week 
(Nelson  et  al.,  1994).  Compared  with  control  subjects  who  were  sedentary,  the  exercisers  signifi¬ 
cantly  improved  their  muscle  mass  as  well  as  their  bone  density.  The  combination  of  hormones  and 
exercise  seems  to  have  a  very  positive  effect  on  bone  density  in  postmenopausal  women.  This  is  illus¬ 
trated  by  a  study  performed  in  Australia  with  120 postmenopausal  women.  The  researchers  exam¬ 
ined  the  forearm  bone  (a  bone  not  affected  by  aerobic  activity)  in  response  to  exercise  and  estrogen. 
The  women  who  did  aerobics  only  (without  weight  training)  did  not  show  any  effect  on  bone  den¬ 
sity.  However,  the  women  who  combined  exercise  and  estrogen  had  significant  improvements  in 
bone density (Nieman, 1998). Such results stress the importance of specificity in the effect
of exercise on bone and the importance of hormones in overall bone development.

On  the  basis  of  many  studies  such  as  the  ones  described,  the  ACSM  has  deduced  the  follow¬ 
ing  five  principles  of  an  exercise  program  to  effectively  prevent  or  treat  osteoporosis  — 

• Principle of specificity. If the leg bones are stressed by running and jumping, then the arm bones will
not benefit unless they too are stressed with specific exercises (e.g., weight lifting).

• Principle of overload. For a bone to improve its density and strength, the exercise stress must exceed
normal levels.

• Principle of reversibility. The positive effect of an exercise program on the skeleton will be lost if
the program is stopped.

• Principle of initial values. People with the lowest levels of bone density and strength will experience
more improvement from an exercise program than those with normal or above-normal bone
density.

• Principle of diminishing returns. Each person has an individual genetic ceiling that limits the
gains in bone mass. As the ceiling is approached, gains in bone mass will slow and plateau. (p. 3)

If  people  participate  in  the  recommended  amount  of  activity,  there  can  be  a  marked  reduction 
in  the  prevalence  and  severity  of  osteoporosis. 


Cancer 


Researchers  have  been  investigating  the  benefits  of  exercise  for  preventing  cancer  for  decades. 
There  have  been  many  studies  that  support  an  inverse  relationship  between  exercise  and  some  types 
of cancer. The most positive results with exercise have been seen in the prevention of colon cancer
and  breast  cancer.  In  fact,  studies  done  by  the  Institute  for  Aerobic  Research  have  shown  that  over 
an  eight-year  period,  the  overall  cancer  death  rate  was  four  times  greater  for  physically  unfit  men 
than  it  was  for  the  most  fit  men  (Nieman,  1998).  All  types  of  cancer  are  thought  to  be  80  percent 
preventable through lifestyle and environmental factors, including diet, cessation of smoking and
tobacco use, reduction of environmental hazards, and refraining from excessive alcohol use (Hoeger
& Hoeger, 1998). Physical activity has many benefits but one interesting adjunct is the major
change  in  the  way  people  live  when  they  are  more  physically  active.  By  adopting  a  lifestyle  that 
includes  regular  physical  activity,  people  may  be  more  likely  to  choose  other  healthy  habits.  These 
additional  fitness-related  behaviors,  such  as  selecting  a  more  healthy  diet,  will  in  turn  reduce  other 
risk  factors  and  resolve  other  health  problems  (e.g.,  obesity).  The  combined  effects  of  these  posi¬ 
tive  habits  may  further  reduce  the  likelihood  of  developing  cancer. 

The American Cancer Society and the National Cancer Institute have published a report stating
that those living a healthy lifestyle have some of the lowest cancer mortality rates ever reported
in scientific studies (American Cancer Society, 1986). A number of additional
studies have correlated physical activity with some mitigating effects against cancer. A few of these
studies have involved injecting mice with cancer-causing chemicals and then dividing the mice into
exercise and non-exercise groups. The mice were then examined for the timing and size of apparent
cancer. In one particular study, mice were consistently capable of clearing certain types of cancer
after only nine weeks of exercise. Their inactive counterparts could not. This may be due to the ability
of the body’s macrophages to clear cancer cells for several hours after exercise.

The positive effects of exercise on the incidence and severity of breast cancer have been repeatedly
studied. Sesso, Paffenbarger, and Lee (1998) studied 1,566 alumnae from the University of
Pennsylvania who were cancer free between 1962 and 1993. They established a physical
activity baseline for each participant by asking the women which types of activities they engaged in
and how often. They then divided the participants into three caloric expenditure categories: <500
kcal/week, 500-999 kcal/week, and >999 kcal/week. Their follow-up questionnaires found 109
cases of breast cancer during 35,365 person-years. After adjusting for age and body mass index
(BMI), they discovered a significant effect of exercise on reducing the rate of breast cancer in postmenopausal
women, but not in premenopausal women. They concluded that physical activity and
breast cancer have a significant, inverse relationship among postmenopausal women.
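
The case count and follow-up time reported above can be turned into a crude incidence rate. The following is a minimal sketch of that arithmetic only. The function name `incidence_rate` and the choice to express the result per 100,000 person-years are illustrative assumptions, not part of the cited study, and the result is an overall crude rate, not the age- and BMI-adjusted comparison the authors reported.

```python
def incidence_rate(cases, person_years, per=100_000):
    """Crude incidence rate: cases per `per` person-years of follow-up.

    Hypothetical helper for illustration; not taken from the cited study.
    """
    if person_years <= 0:
        raise ValueError("person_years must be positive")
    return cases / person_years * per


# Figures quoted in the text: 109 cases during 35,365 person-years.
rate = incidence_rate(109, 35_365)
print(f"{rate:.0f} cases per 100,000 person-years")  # roughly 308
```

A crude rate of this kind describes the whole cohort. Comparing activity categories would require the case counts and person-years for each category, which the text does not give.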

Another study examined the responses of 25,624 women who filled out survey questionnaires
about their leisure-time and work activity (Thune, Brenn, Lund, & Gaard, 1997). Over
13.7 years of follow-up, researchers identified 351 cases of breast cancer among the women. They found
an inverse association between the amount of leisure-time activity and the incidence of
breast cancer. In contrast to the study cited above, Thune et al. found a greater reduction in risk
among regularly exercising, premenopausal women than postmenopausal women, and in younger
(<45 years of age) than in older women. In stratified analysis the risk of breast cancer was the lowest
in lean women (BMI < 22.8) who exercised at least four hours per week. Those with a higher activity
level also had reduced risk and the effect was again more pronounced among premenopausal
women. The conclusion the researchers report is that physical activity, both during leisure time and at
work, is associated with a reduced risk of breast cancer primarily in premenopausal women.

More than 30 studies have been published that have investigated leisure-time and work-time
physical activity in relation to colon cancer. “Three fourths of these studies showed that physically
active compared to inactive people have less colon cancer.” A frequent finding is that people who
tend to sit for most of their workday or remain inactive in their leisure time have a 30 to 100 percent
greater risk of contracting colon cancer (Nieman, 1998) than their more active counterparts.
The Surgeon General reports on 18 studies conducted in a variety of populations, including China,
Denmark, Japan, New Zealand, Sweden, Switzerland, Turkey, and the United States (1996).
“Fourteen studies reported a statistically significant relationship between occupational physical activity
and risk of colon cancer...” (p. 113). In eight study populations an inverse association was reported
between physical activity and risk of colon cancer and results were usually consistent for men and
women. Three studies that examined the effects of physical activity during early adulthood found no
evidence that earlier activity affected the risk of colon cancer later in life.

The number of different kinds of cancers and the difficulty in ascertaining a directly significant
cause-and-effect relationship between activity and cancers make it difficult to predict the protective
effect of exercise. The studies on colon cancer and breast cancer clearly indicate that there is
some link between consistent activity and a reduced incidence of cancer. The general guidelines to
exercise moderately and regularly seem to provide at least some protection, especially when combined
with other healthy habits such as a good diet and general health care.


Clinical  Depression 


According  to  a  report  by  the  National  Institute  of  Mental  Health  (1999),  depression  strikes 
more  than  17  million  Americans  each  year.  This  number  is  greater  than  the  number  of  cases  of 
coronary  heart  disease,  cancer,  or  AIDS.  The  most  troubling  statistic  is  that  15  percent  of 
depressed  people  commit  suicide. 

In 1996, the Surgeon General reported that “Epidemiological research among men and women
suggests that physical activity may be associated with reduced symptoms of depression. In general,
persons who are inactive are twice as likely to have symptoms of depression than are more active persons”
(U.S. Department of Health and Human Services, 1996). Physical activity has been associated
with improved mood and reduced anxiety immediately after, and for up to several hours after, an exercise session
(Nieman, 1998). Possible explanations for the effect of exercise on depression include improved feelings
of self-esteem, increased social interaction, relief from routine stresses, and alterations in brain chemistry.
Any one of these factors or a combination of them may contribute to an enhanced mood state.

Approximately two-thirds of the people suffering from depression do not get professional help.
There are a variety of possible reasons for such a lack of action. People may be too embarrassed or
ashamed of feeling depressed, they may attribute their symptoms to other lifestyle factors such as
poor diet, or they may feel too tired to bother. If left untreated, depression can result in years of
misery and possibly self-inflicted injury or death. The cost of depression is quite high, an estimated
$43 billion per year due to lost work hours, lost productivity, and medical costs (Nordenberg,
1998). Although clinical depression may require more than one intervention (e.g., counseling and
antidepressant medication), exercise may be a simple and effective way to at least somewhat control
depressive symptoms.

Although research has shown a connection between exercise and reduced feelings of tension and
anxiety (APA, 1998), the question of how much exercise is effective for such results still remains.
Because of the characteristic tiredness, lethargy, and disinterest in activity associated with depression,
it is more probable that depressed persons would engage in minimal rather than vigorous activity for
the relief of their symptoms. Many studies have examined the amount and type of exercise needed
for decreasing depression. In a meta-analysis of 104 studies involving 3,048 subjects, some very interesting
dimensions of the effects of exercise on anxiety were documented and are summarized below:

• Training programs usually need to exceed 10 weeks before significant changes in long-term anxiety occur.

• Exercise of at least 20-minutes duration seems necessary to achieve reductions in both present and
long-term anxiety.

• Reductions in both present and long-term anxiety occur after aerobic but not anaerobic (e.g.,
weight lifting) exercise training programs. (p. 79)

These findings indicate a need for prolonged and regular physical activity (American
Psychological Association, 1998). Another research review suggested that “...exercise is an effective
but underused treatment for mild to moderate depression” (Tkachuk, 1999). In this review,
studies from 1981 to the present were analyzed. In each study, exercise was used as an intervention
in treating psychiatric problems, including depression. The overall conclusion drawn from these
studies was that non-aerobic forms of exercise, such as strength training, are as effective as aerobic
exercise in treating depression. The studies also mention that “...less strenuous forms of regular
exercise, such as walking, may be sufficient to demonstrate significant treatment effects” (Tkachuk,
1999). However, they note that more research is needed to confirm this finding.

There is presently sufficient evidence about exercise as a mitigating agent for depressive symptoms
to support the government’s conclusion that regular physical activity enhances psychological
well-being. Further, physical activity may even reduce the risk of developing depression, may reduce
the symptoms of ongoing depression and anxiety, and may generally improve mood. Because of the
complex factors influencing depression, a specific cause-and-effect relationship is difficult to establish.
The research indicates that physical activity may help relieve the disorder and that a minimal
amount of exercise may suffice.


Stroke 


Stroke  is  the  common  name  for  cerebrovascular  disease  or  accident  (CVA).  A  stroke  occurs 
when  arteries  in  the  brain  become  narrow  and  eventually  become  clogged  from  atherosclerosis  of 
the  extracranial  and/or  intracranial  arteries  (U.S.  Department  of  Health  and  Human  Services, 
1996).  This  buildup  is  similar  to  that  which  occurs  in  the  heart  to  cause  heart  disease.  High  blood 
pressure  is  a  major  determinant  of  stroke  occurrence.  A  stroke  results  in  a  deprivation  of  blood  to 
the brain. This deprivation is extremely critical since the brain cells, which cannot store energy,
require 75 percent of the body’s (resting) blood glucose to function. If the brain cells are deprived
of  blood  for  more  than  a  few  minutes,  the  cells  die.  The  result  is  impaired  vision,  speech,  motor 
function,  comprehension,  and  possibly  death  (Nieman,  1998). 

Each  year  approximately  500,000  Americans  suffer  from  a  stroke.  The  physical  effects  of  a 
stroke  are  quite  severe,  resulting  in  death  within  one  year  for  30  percent  of  its  victims,  and  within 
eight  years  for  60  percent  (American  Heart  Association,  1997).  Stroke  ranks  as  the  third  leading 
cause of death following heart disease and cancer (NIH, 1996). Those who do not die from a stroke
have  a  50  percent  chance  of  becoming  functionally  impaired  enough  to  require  assistance  in  caring 
for  themselves.  With  such  devastating  effects,  the  prevention  of  stroke  occurrence  is  a  significant 
health  and  economic  concern. 

There are few studies that specifically relate physical activity to the primary prevention of
stroke. However, the evidence is mounting to indicate that exercise is a significant primary preventative
measure. Physical activity also has secondary beneficial effects on stroke by preventing or
reducing the impact of other risk factors such as hypertension (Sacco et al., 1998). According to the
American Heart Association (AHA), reducing the risk factors is the most effective way to prevent
a stroke from occurring. AHA estimates that 70 percent of all strokes occur in people who have
high blood pressure. The other major risk factors are cigarette smoking, excessive alcohol intake,
and high blood cholesterol (AHA, 1996). Lifestyle factors are also linked to stroke indirectly. The
finding that the rate of stroke occurrence fell 70 percent between 1950 and 1993 is attributed to
changes toward a more active and healthy lifestyle that were occurring during this period.
According to research into the effects of migration on health, men who were born in Japan (which
has a very high stroke death rate) and moved to California were found to have decreased their stroke
death rate by 50 percent (Bronner et al., 1995).

The relationship between physical activity and many stroke risk factors (hypertension, obesity,
diabetes, high blood cholesterol) is very strong. Thus, the current, general recommendations for
exercise, emphasizing leisure-time physical activity, may be just as important in mitigating the risk
of CVA as they are to other conditions. In order to examine directly the association between leisure-time
physical activity and stroke, Sacco et al. (1998) studied 369 subjects with a first stroke and 678 control
subjects who were matched for age, sex, and race-ethnicity. The case subjects were interviewed
within a median of 4 days after stroke onset. Each was asked to report the frequency and duration
of 14 different recreational activities during the two weeks before the stroke. The researchers
adjusted for cardiac disease, peripheral vascular disease, hypertension, diabetes, smoking, alcohol
use, obesity, medical reasons for limited physical activity, education, and season of enrollment in the
study. After these adjustments were made, a significant benefit of leisure-time physical activity was
observed in all age, sex, and racial-ethnic groups. A positive dose-response relationship was found
for both intensity and duration of physical activity as well. Simply stated, this means that physical
activity reduces the risk of CVA no matter who you are, and that the more
exercise you do, the better.

In addition, a study in Great Britain investigated the physical activity levels of 151 stroke
patients and 161 controls (Shinton & Sagar, 1993). The researchers found that a longer duration
of activity in the years before the study was directly associated with greater protection
from stroke. Risk of stroke dropped 56 percent in those who had engaged in regular
and vigorous exercise between the ages of 15 and 25, with additional protection for those who continued
exercising in adulthood. Shinton noted that vigorous exercise early in life seems to have a
particular benefit, and that a lifelong exercise program offers the best health protection.

These studies support the beneficial effects of increasing activity as an effective countermeasure
to stroke risk. Interestingly, other studies have found little or no additional benefit to high levels of
exercise over moderate levels of physical activity (Kiely et al., 1994). Although the evidence is still
being analyzed in greater detail, the case for regular moderate exercise as a way to reduce the risk of
stroke is convincing. By reducing risk factors such as hypertension and obesity, which are clearly
related to the incidence of stroke, a protective effect has been shown to occur to some degree. The
preponderance of research findings seems to agree that some physical activity on a regular basis
minimizes stroke risk. The current standards for health maintenance already incorporate recommendations
for such moderate levels of physical activity and further research may provide even
more details regarding the differential benefits of varying levels of exercise intensity.


Musculoskeletal Problems


There are a multitude of problems and injuries that can occur because of weak muscles surrounding
the joints. The musculoskeletal system has two kinds of connective tissue that support the
joints: tendons, which link muscles to bones, and ligaments, which link bones to bones. The
Surgeon General states that:

Extensive animal studies indicate that ligaments and tendons become stronger with prolonged and
high-intensity exercise. The effect is the result of an increase in the strength of the insertion sites between
ligaments and tendons. These structures also become weaker and smaller with several weeks of immobilization,
which can have implications for musculoskeletal performance and risk of injury. (p. 69)

One of the most common injuries is low back pain, which affects an estimated 75 million
Americans each year. Such widespread suffering is unnecessary, as 80 percent of the time it can be
prevented (Hoeger & Hoeger, 1998). Low back pain is commonly caused by:

1.  physical  inactivity, 

2.  poor  postural  habits  and  body  mechanics,  and 

3. excessive body weight.

Essentially, weak abdominal and back muscles, poor flexibility (especially of the lower back and
hamstrings), and an abundance of fat lead to back problems. In theory, if the muscles of the
abdomen and gluteal regions are not strong enough to support the spine and the weight surrounding
it, then an unnatural forward tilt occurs in the pelvis. This tilt causes a curvature in the lower
back and puts pressure on the spine, leading to low back pain. Sometimes the problem can be eradicated
by stretching and strengthening tight and weak muscles.

The evidence is mixed, however, when it comes to poor musculoskeletal fitness as a predictor of
back pain. Some studies have found that weaker muscles correlate with lower back pain (Lee, Boreskie,
Law, & Russell, 1995), while others found little evidence of such a correlation (Ready et al., 1993).
Back pain is associated with a cycle of inactivity: inactivity weakens the muscles so that they cannot effectively
support the spine, which may cause further problems that are more difficult to eradicate through movement.
Other researchers (Malmivaara & Aro, 1995) have found that normal, moderate activity is
most effective in treating back pain. This particular study assigned back-pain patients to either bed
rest, regular activity, or back exercises. The patients who resumed ordinary activity recovered faster
than those who stayed in bed or those who exercised.

Health and safety
Health benefits
Heart disease
Heart rate
Height
Hypertension
Individual differences
Intercept differences
166, 168, 179-182, 184-192, 194-195, 197, 207, 209-210, 213, 216
Job  performance  criteria 

158,168 

Job  placement 

202 

Job specific

Muscle strength and endurance

48 

Musculoskeletal 

29,  33-35,  55,  71,  76,  90,  96, 122, 128, 130,  159,237 
Musculoskeletal  problems 

35,239 

Navy 

5-9, 13-25, 52-53,  81,  94, 114-115, 124, 133, 167 
Navy  fitness 
14-15 

Navy  Fitness  Program 
14-15 

Normative  data 

118,179-182,191,196 

Norm-based 

1,10 

Norms 

4,  13,  17,90,109,  118,173 

Obesity 

14, 17, 19, 23, 32-34, 54, 56, 59-64, 113, 115, 128-129, 132, 134, 159, 176, 223-228, 230,
232-234,236-238,240,242 

Occupational  assessment 
1-3 

Occupational  demands  measurement 

1-2 

Occupational  fitness 

1-3, 6-7, 11, 13, 15, 17, 20,  67-68,  72,  89 

Occupational  standards 

7, 17,  20 

Occupational  tasks 

22,  77,  80, 101, 173, 198 
Osteoporosis 

33,  42,  54,  60,  64,  231-232,  241 

Overweight 

33,  49,  56,  58-59,  61-63, 113, 115, 128-129,  223-228, 230,  232,  234,  236, 238, 240, 242 
Oxygen  uptake 

5,  46,  48,  78-80,  82-83,  93,  97, 102-104, 123, 126, 133, 135-136, 147, 162, 164, 205 
Passing  scores 

179-181, 185-187, 189, 191-193, 196

Perceived exertion


76,  80,  91,  93-94, 132, 164, 172 
Percent  body  fat 

8, 11, 15, 17, 22, 25, 53,  64, 107, 109-117, 124-127, 133, 147, 150-151, 159-162, 170, 205, 
223,241 

Performance 

2-3,  6-7,  9, 11, 13, 15-23, 27-32,  34,  36,  39,  45, 49-51,  53-55,  57,  61-63,  67-69,  71-75, 
77-78,  81,  83-84,  88-89,  95-98,  101-102,  104-106, 108, 110,  112, 114,  116, 118, 120, 
122-128, 130-141, 143, 145-146, 148, 151-152, 154, 158, 160-168, 171, 173-176, 179-202, 
206-207,209-210,213,216,218-220,237,239-240 

Performance  standards 

3,  7, 9, 16, 21, 23,  62,  179-180, 182, 184, 186, 188, 190, 192, 194, 196, 198,200 

Physical 

1-25,  27-47,  49-65,  67-70,  72-73,  75-78,  80-88,  90-92,  94-98, 101-102, 104, 106-108, 
110, 112, 114, 116, 118-122, 124, 126, 128, 130-140, 142, 144, 146-152, 154, 156, 158-164, 
166-178, 180, 183, 186-188,  191-199,  202-205,  207-211,  213,  216-220,  223-231,  233-241 

Physical  activity 

15, 27-45, 47,  54-65,  80,  82,  86, 107-108, 119, 132, 136, 219,  224-231,  233-241 
Physical  Demands  Analysis 

67-70,  72-73,  75-76,  86-88,  90-92 

Physical  fitness 

1,  3-6,  8-17, 20-25, 27-30,  34,  37,  45-47,  49-54,  56-63,  65,  67-68,  80,  94-96, 101-102, 
128,  131-133, 136, 163, 172-173,  191, 198,  219,  223,  226,  238-240 
Physical  fitness  assessment 

4,  8, 15 

Physical  fitness  measurement 

4 

Physical  performance  tests 

51,  57,  97, 101-102, 104, 106, 108, 110, 112, 114, 116, 118, 120, 122, 124, 126, 128, 130, 
132,134,136,138,160-161,163-164,193,197-199 

Physical  testing 

159,191,193,195,202,205,208,218 

Physical  tests 

25, 139, 146-147, 158, 167, 171,  191-196, 199,  205, 208-210 
Physiological  validation 

139,143,146,148,152,156,158-159,171,202 
Physiological  parameters 

208,211 

Physiological  tests 
146-147,153,179 
Physiology 

23,  62-64,  79,  84,  94-98,  123, 131, 133-137,  148,  172,  175,  177,  197,  199,  202-203,  213, 
218-220,239-240 




Placement 


97,116,128,160,180,202 

Posture 

71-72,  74,  77-78,  88-90,  93 

Preemployment  testing 
118,202,208 
Production  rate 

158 

Protected  group 

148, 159, 201, 204-205, 208

Psychological  tests 

139,146-147,153,171 
Psychometric  tests 

203 

Psychometric  test  theory 

203 

Readiness  and  fitness 

49 

Recommendations  from  the  American  Heart  Association 

43 

Recommendations,  activity  level 
40-44 

Regression  slopes 
148-152 
Run  field  tests 
105 
Scores 

3, 10, 13, 15, 17-19,  72,  89, 122, 143-145, 147-148,  153, 155, 165, 179-182, 185-189, 
191-197,213 
Scores,  passing 

179-181,185-187,189,191-193,196 

Selection 

16, 20, 28,  47,  54,  56,  68,  70,  73,  76,  93-94,  97,  128, 139-146, 148, 150,152, 154, 156, 
158-160, 162, 164, 166, 168, 170-174, 176-178, 180-181, 185-187, 189, 191-192, 197-199, 

201, 204-208, 212, 218

Selection method

201,204,206-207 

Standards 

1-24,  26-28,  30-32,  34,  36,  38-40,  42,  44,  46-54,  56-58,  60-64,  66-69,  73,  82-84,  89-90, 
95-96,  109-110,  113-114,  116-117,  133, 139, 144-147,  151, 153,  159,  164, 168-169, 
172-173, 175, 179-182, 184, 186, 188, 190-192, 194-200,  202, 205, 208-209, 214-215, 219, 
237, 240

Standards, body fat

8, 13-15, 17, 19, 116-117 
Standards,  fitness 

13,  48-50,  56 

Standards,  height  and  weight 

114,202,208-209 

Standards,  performance 

3,  7,  9, 16, 21,  23,  62, 179-180, 182, 184,  186,188,190,  192,194,196,198,200 

Steel  workers 

163 

Strain 

67-69,  71-72,  77-78,  80-82,  84,  87-88,  93, 119 

Strength 

1,  3-11, 13-14, 16-20,  22-23, 27,  29-30,  36,  38-43,  46,  48-49,  51,  55-57,  63-64,  68,  72,  89, 
92,  94-98, 101-102, 116-120, 122-128, 130-132, 134-136, 141, 143, 148-149, 151-158, 
160-172, 174-175, 179, 181, 192-194, 199, 205, 209-212, 215-217, 219-220,  225, 231-232, 
235,237,241 
Stress 

50,  67-69,  71-72,  77,  80-81,  84,  86,  91,  93,  98, 137, 146, 148, 158, 169-170, 173, 176, 199, 
220,228,232 

Stroke 

33-34,  55,  58,  61-62,  64, 104, 115, 122, 129, 131,  227-228, 235-237, 239 

Sub  maximal  aerobic 

104 

Success  in  training 

158 

Supervisory  ratings 

158 

Task 

2-3,  7, 16,  29,  50,  53,  62,  67,  69-79,  81-82,  88-96, 102, 121-122, 128, 130-131, 136, 
139-143, 146-156, 158-166, 169-171, 175, 183-186, 193, 205,  211, 215-216,  220 

Task  performance 

16, 29,  69,  71-72,  77-78,  88-89,  95, 128, 152, 160-161 

Tasks,  manual  lifting 

168 

Task-specific 

27,  51,  57,  76 

Taylor 

2-3,  20, 23-24,  93-94,  96-98, 103, 136, 176, 181, 186-187, 196, 199 

Taylor-Russell  Tables 

179,184,186-187 

Temperature 

68, 72, 79, 82, 84-86, 93, 96-97, 112

Terman

4 

Test 

1,  4-5,  7,  9-18,  20-24,  45,  47,  49-51,  53, 56-57,  83,  89,  96-98, 101-108, 114-128, 130-133, 
136, 139-154, 156-174, 176-182, 185-199,  201-210, 212-219, 228 
Test  fairness 

148,179,193-197 
Test  validation 

139,147-149,153,158-159,171,206 

Tests,  physiological 
146-147,153,179 
Tests,  psychological 

25 

Tests,  psychometric 

203 

Thorndike 
3-4,  24 
Thresholds 
51,  96 

Title  VII 

180,201,203-206,209-210,213 

Underground  coal  mining 
164 
Valid 

9, 18,  30,  52-53,  69-70,  87, 103, 105-106, 108, 121-122, 124, 126-127, 130, 140, 146-147, 
152-154, 159, 161, 168-169, 180-181, 194-196,  201-202, 206, 208, 210-211 

Validation  study 

145, 155, 159-160, 168, 185,  201-202, 207,  216-217 
Validation,  physiological 

139,143,146,148,152,156,158-159,171,202 

Validity 

13, 22,  27,  57,  89,  92, 121-123, 126, 131, 134, 136, 139-141, 143-148, 153, 155, 162, 164, 
166-167, 171, 174-176, 179, 181-182, 186-187, 192, 194-196, 199, 206-207,  210-211, 214, 
220 

Validity,  criterion-related 

139-140 

Work  physiology 

79,  94, 123, 131, 172, 197,  203, 218 
Work  sample  tests 

101-102, 121-125, 130-131, 161-164

YMCA Test Battery

5